Abstract

We discuss necessary and sufficient conditions for a sensing matrix to be "s-good" - to allow for exact
$\ell_1$-recovery of sparse signals with s nonzero entries when no measurement noise is present. Then we
express the error bounds for imperfect $\ell_1$-recovery (nonzero measurement noise, nearly s-sparse signal,
near-optimal solution of the optimization problem yielding the $\ell_1$-recovery) in terms of the characteristics
underlying these conditions. Further, we demonstrate (and this is the principal result of the paper) that
these characteristics, although difficult to evaluate, lead to verifiable sufficient conditions for exact sparse
$\ell_1$-recovery and to efficiently computable upper bounds on those s for which a given sensing matrix is
s-good. We also establish instructive links between our approach and the basic concepts of the Compressed
Sensing theory, like Restricted Isometry or Restricted Eigenvalue properties.

1 Introduction

In the existing literature on sparse signal recovery and Compressed Sensing (see [4-10,18-22] and references
therein) the emphasis is on assessing a sparse signal $w \in \mathbb{R}^n$ from an observation $y \in \mathbb{R}^k$ (in this context
$k \ll n$):

$$y = Aw + \xi, \quad \|\xi\| \le \epsilon, \tag{1.1}$$

where $\|\cdot\|$ is a given norm on $\mathbb{R}^k$, $\xi$ is the observation error and $\epsilon \ge 0$ is a given upper bound on the error
magnitude, measured in the norm $\|\cdot\|$. One of the most popular (computationally tractable) estimators
which is well suited for recovering sparse signals is the $\ell_1$-recovery given by

$$\hat w \in \mathop{\rm argmin}_z \{\|z\|_1 : \|Az - y\| \le \epsilon\}. \tag{1.2}$$

The existing Compressed Sensing theory focuses on this estimator, and since our main motivation comes
from Compressed Sensing, we will also concentrate on this particular recovery. It is worth mentioning
that other closely related estimation techniques are used in the statistical community; the most renowned examples
are the "Dantzig Selector" (cf. [5]), given by

$$\hat w \in \mathop{\rm argmin}_z \{\|z\|_1 : \|A^T(Az - y)\|_\infty \le \epsilon\}, \tag{1.3}$$

and the Lasso estimator, see [21, 4], which under the sparsity scenario exhibits similar behavior.

The theory offers strong results which state, in particular, that if w is s-sparse (i.e., has at most s nonzero
entries) and A possesses a certain well-defined property, then the $\ell_1$-recovery of w is close to w, provided
the observation error $\epsilon$ is small. For instance, necessary and sufficient conditions for exactness of $\ell_1$-recovery
in the case of noiseless observation (when $\epsilon = 0$) have been established in [23, 16, 15]. Specifically, in [23] it
is shown that every s-sparse w is the unique solution of the noiseless $\ell_1$-recovery problem

$$\min_z \{\|z\|_1 : Az = Aw\} \tag{1.4}$$

if and only if the kernel Ker A of the sensing matrix is strictly s-balanced, the latter meaning that for any set
$I \subset \{1,...,n\}$ of cardinality $\le s$ it holds

$$\sum_{i \in I} |z_i| < \sum_{i \notin I} |z_i| \quad \text{for any nonzero } z \in {\rm Ker}\,A \tag{1.5}$$

(that the above condition is sufficient for the $\ell_1$-recovery to be exact in the noiseless case was stated in
[14]).

Some particularly impressive results make use of the Restricted Isometry property which is as follows:
a $k \times n$ matrix A is said to possess the Restricted Isometry (RI($\delta$, m)) property with parameters $\delta \in (0, 1)$
and m, where m is a positive integer, if

$$\sqrt{1-\delta}\,\|x\|_2 \le \|Ax\|_2 \le \sqrt{1+\delta}\,\|x\|_2 \quad \text{for all } x \in \mathbb{R}^n \text{ with at most } m \text{ nonzero entries.} \tag{1.6}$$

For instance, the following result is well known ([H), Theorem 1.2] or [9, Theorem 4.1]): let $\|\cdot\|$ in (1.1) be
the Euclidean norm $\|\cdot\|_2$, and let the sensing matrix A satisfy the RI($\delta$, 2s)-property with $\delta < \sqrt{2} - 1$. Then

$$\|\hat w - w\|_1 \le 2(1-\rho)^{-1}\left[\alpha\epsilon\sqrt{s} + (1+\rho)\|w - w^s\|_1\right], \tag{1.7}$$

where $\alpha = \frac{2\sqrt{1+\delta}}{1-\delta}$, $\rho = \frac{\sqrt{2}\,\delta}{1-\delta}$, and $w^s$ is obtained from w by zeroing all but the s largest in absolute value
entries. The conclusion is that when A is RI($\delta$, 2s) with $\delta < \sqrt{2} - 1$, $\ell_1$-recovery reproduces well signals
with small s-tails (small $\|w - w^s\|_1$), provided that the observation error is small. Even more impressive is
the fact that there are $k \times n$ sensing matrices A which possess, say, the RI(1/4, 2s)-property for "large"
s - as large as $O(k/\ln(n/k))$. For instance, this is the case, with overwhelming probability, for matrices
obtained by normalization (dividing columns by their $\|\cdot\|_2$-norms) of random matrices with i.i.d. standard
Gaussian or $\pm 1$ entries, as well as for normalizations of random submatrices of the Fourier transform or
other orthogonal matrices.

On the negative side, random matrices are the only known matrices which possess the RI($\delta$, 2s)-property
for such large values of s. For all known deterministic families of $k \times n$ matrices provably
possessing the RI($\delta$, 2s)-property, one has $s = O(\sqrt{k})$ (see [13]), which is essentially worse than the bound
$s = O(1)(k/\ln(n/k))$ promised by the RI-based theory. Moreover, the RI-property itself is "intractable" - the
only currently available technique to verify the RI($\delta$, m) property for a $k \times n$ matrix amounts to testing all its
$k \times m$ submatrices. In other words, given a large sensing matrix A, one can never be sure that it possesses
the RI($\delta$, m)-property with a given $m \gg 1$.

Certainly, the RI-property is not the only property of a sensing matrix A which allows one to obtain good
error bounds for $\ell_1$-recovery of sparse signals. Two related characteristics are the Restricted Eigenvalue
assumption introduced in [4] and the Restricted Correlation assumption of [3], among others. However,
they share with the RI-property not only the nice consequences as in (1.7), but also the drawback of being
computationally intractable. To summarize our very restricted and sloppy description of the existing results
on $\ell_1$-recovery: neither strict s-balancedness, nor Restricted Isometry, nor the Restricted Correlation assumption
and the like allow one to answer affirmatively the question whether, for a given sensing matrix A, an accurate
$\ell_1$-recovery of sparse signals with a given number s of nonzero entries is possible.

Now, suppose we face the following problem: given a sensing matrix A, which we are allowed to modify
in certain ways to obtain a new matrix $\bar A$, our objective is, depending on the problem's specifications, either the
maximal improvement, or the minimal deterioration of the sensing properties of A with respect to sparse
$\ell_1$-recovery. As a simple example, one can think of a 2- or 3-dimensional n-point grid E of possible
locations of signal sources and an N-element grid R of possible locations of sensors. A sensor at a given
location measures a known linear form of the signals emitted at the nodes of E which depends on location,
and the goal is to place a given number $k < N$ of sensors at the nodes of R in order to be able to recover,
via the $\ell_1$-recovery, all s-sparse signals. Formally speaking, we are given an $N \times n$ matrix A, and our goal
is to extract from it a $k \times n$ submatrix $\bar A$ which is s-good - such that whenever the true signal w in (1.1)
is s-sparse and there is no observation error ($\xi = 0$), the $\ell_1$-recovery (1.2) recovers w exactly. To the best
of our knowledge, the only existing computationally tractable techniques which allow one to approach such a
synthesis problem are those based on the mutual incoherence

$$\mu(A) = \max_{i \ne j} \frac{|A_i^T A_j|}{A_i^T A_i} \tag{1.8}$$

of a $k \times n$ sensing matrix A with columns $A_i$ (assumed to be nonzero). Clearly, the mutual incoherence can
be easily computed even for large matrices. Moreover, bounds of the same type as in (1.7) can be obtained
for matrices with small mutual incoherence: a matrix A with mutual incoherence $\mu(A)$ and columns $A_j$ of
unit $\|\cdot\|_2$-norm satisfies the RI($\delta$, m) assumption (1.6) with $\delta = (m-1)\mu(A)$. Unfortunately, the latter relation
implies that $\mu$ should be very small to certify the possibility of accurate $\ell_1$-recovery of non-trivial sparse
signals, so that the estimates of the "goodness" of sensing for $\ell_1$-recovery based on mutual incoherence are
very conservative.

The goal of this paper is to provide new computationally tractable sufficient conditions for sparse recovery.

The overview of our main results is as follows. 

1. For $x \in \mathbb{R}^n$, let

$$\|x\|_{s,1} = \max_{{\rm Card}(I) \le s} \sum_{i \in I} |x_i|$$

stand for the sum of the s maximal magnitudes of components of x. Set

$$\hat\gamma_s(A) = \max_x \{\|x\|_{s,1} : \|x\|_1 \le 1,\ Ax = 0\}.$$

Starting from optimality conditions for the problem (1.4) of noiseless $\ell_1$-recovery, we show that A
is s-good if and only if $\hat\gamma_s(A) < 1/2$, thus recovering some of the results of [23]. While $\hat\gamma_s(A)$ is
fully responsible for ideal $\ell_1$-recovery of s-sparse signals under ideal circumstances, when there is no
observation error in (1.1) and (1.2) is solved to precise optimality, in order to cope with the case of
imperfect $\ell_1$-recovery (nonzero observation error, nearly s-sparse true signal, (1.2) is not solved to
exact optimality), we embed the characteristic $\hat\gamma_s(A)$ into a single-parametric family of characteristics
$\hat\gamma_s(A, \beta)$, $0 \le \beta \le \infty$. Here

$$\hat\gamma_s(A, \beta) = \max_x \{\|x\|_{s,1} - \beta\|Ax\| : \|x\|_1 \le 1\}$$

(note that $\hat\gamma_s(A, \beta)$ is nonincreasing in $\beta$ and is equal to $\hat\gamma_s(A)$ for all large enough values of $\beta$). We
then demonstrate (Section 3) that whenever $\beta < \infty$ is such that $\hat\gamma_s(A, \beta) < 1/2$, the error of imperfect
$\ell_1$-recovery $\hat w$ admits an explicit upper bound, similar in structure to the RI-based bound (1.7):

$$\|\hat w - w\|_1 \le (1 - 2\hat\gamma_s(A, \beta))^{-1}\left[2\beta\epsilon + 2\|w - w^s\|_1 + \nu\right],$$

where $\epsilon$ is the measurement error and $\nu$ is the inaccuracy in solving (1.2).

2. The characteristic $\hat\gamma_s(A, \beta)$ is still difficult to compute. In Section 4, we develop efficiently computable
lower and upper bounds on $\hat\gamma_s(A, \beta)$. In particular, we show that the quantity $\alpha_s(A, \beta)$,

$$\alpha_s(A, \beta) := \min_{Y = [y_1,...,y_n] \in \mathbb{R}^{k \times n}} \left\{ \max_{1 \le j \le n} \|(I - Y^T A)e_j\|_{s,1} : \|y_i\|_* \le \beta,\ 1 \le i \le n \right\}$$

(here $\|\cdot\|_*$ is the norm conjugate to $\|\cdot\|$) is an upper bound on $\hat\gamma_s(A, s\beta)$.

This bound provides us with an efficiently verifiable (although perhaps conservative) sufficient
condition for s-goodness of A, namely, $\alpha_s(A, \beta) < 1/2$. We demonstrate that our verifiable sufficient
conditions for s-goodness are less restrictive than those based on mutual incoherence. On the other
hand, the proposed lower bounds on $\hat\gamma_s(A, \beta)$ allow one to bound from above the values of s for which A
is s-good.

We also study limitations of our sufficient conditions for s-goodness: unfortunately, it turns out that
these conditions, as applied to a $k \times n$ matrix A, cannot justify its s-goodness when $s > 2\sqrt{2k}$, unless
A is "nearly square". While being much worse than the theoretically achievable, for appropriate A's,
level $O(k/\ln(n/k))$ of s for which A may be s-good, this "limit of performance" of our machinery
nearly coincides with the best known values of s for which explicitly given individual s-good $k \times n$
sensing matrices are known.

3. In Section 5, we investigate the implications of the RI property in our context. While these implications
do not contribute to the "constructive" part of our results (since the RI property is difficult to verify),
they certainly contribute to a better understanding of our approach and to integrating it into the existing
Compressed Sensing theory. The most instructive result of this Section is as follows: whenever A is,
say, RI(1/4, m) (so that A is s-good for s = O(1)m), our verifiable sufficient conditions do certify
that A is $O(1)\sqrt{m}$-good - they guarantee "at least the square root of the true level s of goodness".

4. Section 6 presents some very preliminary numerical illustrations of our machinery. These illustrations,
in particular, present experimental evidence of how significantly this machinery can outperform the
one based on mutual incoherence - the only existing computationally tractable way to certify
goodness known to us.

When this paper was finished, we became aware of the preprint [12], which contains results closely related
to some of those in our paper. The authors of [12] have "extracted" from [11] the sufficient condition
$\hat\gamma_s(A) < 1/2$ for s-goodness of A and proposed an efficiently computable upper bound on $\hat\gamma_s(A)$ based on
semidefinite relaxation. This bound is essentially different from ours, and it could be interesting to find out
if one of these bounds is "stronger" than the other.



2 Characterizing s-goodness 

2.1 Characteristics $\gamma_s(\cdot)$ and $\hat\gamma_s(\cdot)$: definition and basic properties

The "minimal" requirement on a sensing matrix A to be suitable for recovering s-sparse signals (that is,
those with at most s nonzero entries) via $\ell_1$-minimization is as follows: whenever the observation y in (1.2)
is noiseless and comes from an s-sparse signal w: y = Aw, w should be the unique optimal solution of the
optimization problem in (1.2) where $\epsilon$ is set to 0. This observation motivates the following

Definition 1 Let A be a $k \times n$ matrix and s be an integer, $0 \le s \le n$. We say that A is s-good, if for every
s-sparse vector $w \in \mathbb{R}^n$, w is the unique optimal solution to the optimization problem

$$\min_x \{\|x\|_1 : Ax = Aw\}. \tag{2.9}$$

Let $s_*(A)$ be the largest s for which A is s-good; this is a well defined integer, since for trivial reasons every
matrix is 0-good. It is immediately seen that $s_*(A) \le \min[k, n]$ for every $k \times n$ matrix A.
From now on, $\|\cdot\|$ is the norm on $\mathbb{R}^k$ and $\|\cdot\|_*$ is its conjugate norm:

$$\|y\|_* = \max_v \{v^T y : \|v\| \le 1\}.$$

We are about to introduce two quantities which are "responsible" for s-goodness.
Definition 2 Let A be a $k \times n$ matrix, $\beta \in [0, \infty]$ and $s \le n$ be a nonnegative integer. We define the
quantities $\gamma_s(A, \beta)$, $\hat\gamma_s(A, \beta)$ as follows:

(i) $\gamma_s(A, \beta)$ is the infimum of $\gamma \ge 0$ such that for every vector $z \in \mathbb{R}^n$ with s nonzero entries, equal to $\pm 1$,
there exists a vector $y \in \mathbb{R}^k$ such that

$$\|y\|_* \le \beta \ \ \&\ \ (A^T y)_i \begin{cases} = {\rm sign}(z_i), & z_i \ne 0, \\ \in [-\gamma, \gamma], & z_i = 0. \end{cases} \tag{2.10}$$

If for some z as above there does not exist y with $\|y\|_* \le \beta$ such that $A^T y$ coincides with z on the support
of z, we set $\gamma_s(A, \beta) = \infty$.

(ii) $\hat\gamma_s(A, \beta)$ is the infimum of $\gamma \ge 0$ such that for every vector $z \in \mathbb{R}^n$ with s nonzero entries, equal to $\pm 1$,
there exists a vector $y \in \mathbb{R}^k$ such that

$$\|y\|_* \le \beta \ \ \&\ \ \|A^T y - z\|_\infty \le \gamma. \tag{2.11}$$

To save notation, we will skip indicating $\beta$ when $\beta = \infty$, thus writing $\gamma_s(A)$ instead of $\gamma_s(A, \infty)$, and
similarly for $\hat\gamma_s$.

Several immediate observations are in order: 

A. It is easily seen that the sets of values of $\gamma$ participating in (i-ii) are closed, so that when $\gamma_s(A, \beta) < \infty$,
then for every vector $z \in \mathbb{R}^n$ with s nonzero entries, equal to $\pm 1$, there exists y such that

$$\|y\|_* \le \beta \ \ \&\ \ (A^T y)_i \begin{cases} = {\rm sign}(z_i), & z_i \ne 0, \\ \in [-\gamma_s(A, \beta), \gamma_s(A, \beta)], & z_i = 0. \end{cases} \tag{2.12}$$

Similarly, for every z as above there exists y such that

$$\|y\|_* \le \beta \ \ \&\ \ \|A^T y - z\|_\infty \le \hat\gamma_s(A, \beta). \tag{2.13}$$

B. The quantities $\gamma_s(A, \beta)$ and $\hat\gamma_s(A, \beta)$ are convex nonincreasing functions of $\beta$, $0 \le \beta < \infty$. Moreover,
from A it follows that for a given A, s and all large enough values of $\beta$ one has $\gamma_s(A, \beta) = \gamma_s(A)$ and
$\hat\gamma_s(A, \beta) = \hat\gamma_s(A)$.

C. Taking into account that the set $\{A^T y : \|y\|_* \le \beta\}$ is convex, it follows that if $\gamma_s(A, \beta) < \infty$, then the
vectors y satisfying (2.12) exist for every s-sparse vector z with $\|z\|_\infty \le 1$, not only for vectors with exactly
s nonzero entries equal to $\pm 1$. Similarly, vectors y satisfying (2.13) exist for all s-sparse z with $\|z\|_\infty \le 1$.
As a byproduct of these observations, we see that $\gamma_s(A, \beta)$ and $\hat\gamma_s(A, \beta)$ are nondecreasing in s.

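Our interest in the quantities $\gamma_s(\cdot, \cdot)$ and $\hat\gamma_s(\cdot, \cdot)$ stems from the following

Theorem 1 Let A be a $k \times n$ matrix and $s \le n$ be a nonnegative integer.

(i) A is s-good if and only if $\gamma_s(A) < 1$.

(ii) For every $\beta \in [0, \infty]$ one has

(a) $\gamma := \gamma_s(A, \beta) < 1 \ \Rightarrow\ \hat\gamma_s\left(A, \frac{1}{1+\gamma}\beta\right) = \frac{\gamma}{1+\gamma} < 1/2$;

(b) $\hat\gamma := \hat\gamma_s(A, \beta) < 1/2 \ \Rightarrow\ \gamma_s\left(A, \frac{1}{1-\hat\gamma}\beta\right) = \frac{\hat\gamma}{1-\hat\gamma} < 1$.

The proof of Theorem 1 is given in Appendix A.

Theorem 1 explains the importance of the characteristic $\gamma_s(\cdot)$ in the context of $\ell_1$-recovery. However, it
is technically more convenient to deal with the quantity $\hat\gamma_s(\cdot)$.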

2.2 Equivalent representation of $\hat\gamma_s(A)$

According to Theorem 1 (ii), the quantities $\gamma_s(\cdot)$ and $\hat\gamma_s(\cdot)$ are tightly related. In particular, the equivalent
characterization of s-goodness in terms of $\hat\gamma_s(A)$ reads as follows:

A is s-good $\Leftrightarrow$ $\hat\gamma_s(A) < 1/2$.

In the sequel, we shall heavily utilize an equivalent representation of $\hat\gamma_s(A, \beta)$ which, as we shall see in Section
4, has important algorithmic consequences. The representation is as follows:
Theorem 2 Consider the polytope

$$P_s = \{u \in \mathbb{R}^n : \|u\|_1 \le s,\ \|u\|_\infty \le 1\}.$$

One has

$$\hat\gamma_s(A, \beta) = \max_{u,x} \{u^T x - \beta\|Ax\| : u \in P_s,\ \|x\|_1 \le 1\}. \tag{2.15}$$

In particular,

$$\hat\gamma_s(A) = \max_{u,x} \{u^T x : u \in P_s,\ \|x\|_1 \le 1,\ Ax = 0\}. \tag{2.16}$$

Proof. By definition, $\hat\gamma_s(A, \beta)$ is the smallest $\gamma$ such that the closed convex set $C_{\gamma,\beta} := A^T B_\beta + \gamma B$, where
$B_\beta = \{w \in \mathbb{R}^k : \|w\|_* \le \beta\}$ and $B = \{v \in \mathbb{R}^n : \|v\|_\infty \le 1\}$, contains all vectors with s nonzero entries,
equal to $\pm 1$. This is exactly the same as to say that $C_{\gamma,\beta}$ contains the convex hull of these vectors; the latter
is exactly $P_s$. Now, $\gamma$ satisfies the inclusion $P_s \subset C_{\gamma,\beta}$ if and only if for every x the support function of $P_s$
is majorized by that of $C_{\gamma,\beta}$, namely, for every x one has

$$\max_{u \in P_s} u^T x \le \max_{y \in C_{\gamma,\beta}} y^T x = \max_{w,v} \{x^T A^T w + \gamma x^T v : \|w\|_* \le \beta,\ \|v\|_\infty \le 1\} = \beta\|Ax\| + \gamma\|x\|_1, \tag{2.17}$$

with the convention that when $\beta = \infty$, $\beta\|Ax\|$ is $\infty$ or 0 depending on whether $\|Ax\| > 0$ or $\|Ax\| = 0$.
That is, $P_s \subset C_{\gamma,\beta}$ if and only if, for every x,

$$\max_{u \in P_s} (u^T x - \beta\|Ax\|) \le \gamma\|x\|_1.$$

By homogeneity w.r.t. x, this is equivalent to

$$\max_{u,x} \{u^T x - \beta\|Ax\| : u \in P_s,\ \|x\|_1 \le 1\} \le \gamma.$$

Thus, $\hat\gamma_s(A, \beta)$ is the smallest $\gamma$ for which the concluding inequality takes place, and we arrive at (2.15), (2.16).



Recall that for $x \in \mathbb{R}^n$, $\|x\|_{s,1}$ is the sum of the s largest magnitudes of entries in x, or, equivalently,

$$\|x\|_{s,1} = \max_{u \in P_s} u^T x.$$

Combining Theorem 1 and Theorem 2, we get the following

Corollary 1 For a matrix $A \in \mathbb{R}^{k \times n}$ one has $\hat\gamma_s(A) = \max_x \{\|x\|_{s,1} : Ax = 0,\ \|x\|_1 \le 1\}$, $1 \le s \le n$. As a
result, matrix A is s-good if and only if the maximum of $\|\cdot\|_{s,1}$-norms of vectors $x \in {\rm Ker}(A)$ with $\|x\|_1 = 1$
is < 1/2.
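Corollary 1 can be made concrete in the simplest case, when Ker(A) is one-dimensional, so that the maximum over the kernel is attained at the normalized spanning vector. The following is a minimal sketch under that assumption; the function names `s_norm` and `gamma_hat_one_dim_kernel` and the example matrix are illustrative and not taken from the paper.

```python
def s_norm(x, s):
    """||x||_{s,1}: sum of the s largest magnitudes among the entries of x."""
    return sum(sorted((abs(v) for v in x), reverse=True)[:s])


def gamma_hat_one_dim_kernel(v, s):
    """gamma_hat_s(A) when Ker(A) is spanned by the single nonzero vector v.

    Feasible x in Corollary 1 are t*v with |t| <= 1/||v||_1, so the maximum
    of ||x||_{s,1} over the kernel equals ||v||_{s,1} / ||v||_1.
    """
    return s_norm(v, s) / sum(abs(c) for c in v)


# Illustrative matrix (an assumption): A = [[1, 0, -1], [0, 1, -1]],
# whose kernel is spanned by v = (1, 1, 1).
v = [1.0, 1.0, 1.0]
g1 = gamma_hat_one_dim_kernel(v, 1)  # 1/3 < 1/2, so A is 1-good
g2 = gamma_hat_one_dim_kernel(v, 2)  # 2/3 >= 1/2, so A is not 2-good
```

For kernels of higher dimension the maximization is no longer a one-line formula, which is why the bounds of Section 4 are needed.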



Note that (2.15) and (2.16) can be seen as an equivalent definition of $\hat\gamma_s(A, \beta)$, and one can easily prove
Corollary 1 without any reference to Theorem 1, and thus without a necessity even to introduce the
characteristic $\gamma_s(A, \beta)$. However, we believe that from the methodological point of view the result of Theorem
1 is important, since it reveals the "true origin" of the quantities $\gamma_s(\cdot)$ and $\hat\gamma_s(\cdot)$ as the entities coming from
the optimality conditions for the problem (2.9).
3 Error bounds for imperfect $\ell_1$-recovery via $\hat\gamma$

We have seen that the quantity $\gamma_s(A)$ (or, equivalently, $\hat\gamma_s(A)$) is responsible for s-goodness of a sensing
matrix A, that is, for the precise $\ell_1$-recovery of an s-sparse signal w in the "ideal case" when there is no
measurement error and the optimization problem (2.9) is solved to exact optimality. It appears that the
same quantities control the error of $\ell_1$-recovery in the case when the vector $w \in \mathbb{R}^n$ is not s-sparse and the
problem (2.9) is not solved to exact optimality. To see this, let $w^s$, $s \le n$, stand for the best, in terms
of $\ell_1$-norm, s-sparse approximation of w. In other words, $w^s$ is the vector obtained from w by zeroing all
coordinates except for the s largest in magnitude.

Proposition 1 Let A be a $k \times n$ matrix, $1 \le s \le n$ and let $\hat\gamma_s(A) < 1/2$ (or, which is the same, $\gamma_s(A) < 1$).
Let also x be a $\nu$-optimal approximate solution to the problem (2.9), meaning that

$$Ax = Aw \quad \text{and} \quad \|x\|_1 \le {\rm Opt}(Aw) + \nu,$$

where Opt(Aw) is the optimal value of (2.9). Then

$$\|x - w\|_1 \le \frac{\nu + 2\|w - w^s\|_1}{1 - 2\hat\gamma_s(A)} = \frac{1 + \gamma_s(A)}{1 - \gamma_s(A)}\left[\nu + 2\|w - w^s\|_1\right].$$

Proof. Let $z = x - w$ and let I be the set of indices of the s largest in magnitude entries of w (i.e., the support of $w^s$).
Denote by $x^{(s)}$ ($z^{(s)}$) the vector obtained from x (z) by replacing by zero all coordinates of x (z) with
indices outside of I. As Az = 0, by Corollary 1,

$$\|z^{(s)}\|_1 \le \|z\|_{s,1} \le \hat\gamma_s(A)\|z\|_1.$$

On the other hand, w is a feasible solution to (2.9), so ${\rm Opt}(Aw) \le \|w\|_1$, whence

$$\|w\|_1 + \nu \ge \|w + z\|_1 = \|w^s + z^{(s)}\|_1 + \|(w - w^s) + (z - z^{(s)})\|_1 \ge \|w^s\|_1 - \|z^{(s)}\|_1 + \|z - z^{(s)}\|_1 - \|w - w^s\|_1,$$

or, equivalently (recall that $\|w\|_1 = \|w^s\|_1 + \|w - w^s\|_1$),

$$\|z - z^{(s)}\|_1 \le \|z^{(s)}\|_1 + 2\|w - w^s\|_1 + \nu.$$

Thus,

$$\|z\|_1 = \|z^{(s)}\|_1 + \|z - z^{(s)}\|_1 \le 2\|z^{(s)}\|_1 + 2\|w - w^s\|_1 + \nu \le 2\hat\gamma_s(A)\|z\|_1 + 2\|w - w^s\|_1 + \nu,$$

and, as $\hat\gamma_s(A) < 1/2$,

$$\|z\|_1 \le \frac{2\|w - w^s\|_1 + \nu}{1 - 2\hat\gamma_s(A)}.$$



We switch now to the properties of approximate solutions x to the problem

$${\rm Opt}(y) = \min_x \{\|x\|_1 : \|Ax - y\| \le \epsilon\}, \tag{3.18}$$

where $\epsilon \ge 0$ and

$$y = Aw + \xi, \quad \xi \in \mathbb{R}^k,$$

with $\|\xi\| \le \epsilon$. We are about to show that in the "non-ideal case", when w is "nearly s-sparse" and (3.18) is
solved to near-optimality, the error of the $\ell_1$-recovery remains "under control" - it admits an explicit upper
bound governed by $\hat\gamma_s(A, \beta)$ with a finite $\beta$. The corresponding result is as follows:
Theorem 3 Let A be a $k \times n$ matrix, $s \le n$ be a nonnegative integer, let $\epsilon \ge 0$, and let $\beta \in [0, \infty)$ be such
that $\hat\gamma := \hat\gamma_s(A, \beta) < 1/2$. Let also $w \in \mathbb{R}^n$, let y in (3.18) be such that $\|Aw - y\| \le \epsilon$, and let $w^s$ be the
vector obtained from w by zeroing all coordinates except for the s largest in magnitude. Assume, further,
that x is a $(\upsilon, \nu)$-optimal solution to (3.18), meaning that

$$\|Ax - y\| \le \upsilon \quad \text{and} \quad \|x\|_1 \le {\rm Opt}(y) + \nu. \tag{3.19}$$

Then

$$\|x - w\|_1 \le (1 - 2\hat\gamma)^{-1}\left[2\beta(\upsilon + \epsilon) + 2\|w - w^s\|_1 + \nu\right]. \tag{3.20}$$



Proof. Since $\|Aw - y\| \le \epsilon$, w is a feasible solution to (3.18) and therefore ${\rm Opt}(y) \le \|w\|_1$, whence, by
(3.19),

$$\|x\|_1 \le \nu + \|w\|_1. \tag{3.21}$$

Let I be the set of indices of the nonzero entries in $w^s$. As in the proof of Proposition 1, we denote by $z = x - w$ the
error of the recovery, and by $x^{(s)}$ ($z^{(s)}$) the vector obtained from x (z) by replacing by zero all coordinates
of x (z) with indices outside of I. By (2.15) we have

$$\|z^{(s)}\|_1 \le \|z\|_{s,1} \le \beta\|Az\| + \hat\gamma\|z\|_1 \le \beta(\upsilon + \epsilon) + \hat\gamma\|z\|_1. \tag{3.22}$$

On the other hand, exactly in the same way as in the proof of Proposition 1 we conclude that

$$\|z\|_1 \le 2\|z^{(s)}\|_1 + 2\|w - w^s\|_1 + \nu,$$

which combines with (3.22) to imply that

$$\|z\|_1 \le 2\beta(\upsilon + \epsilon) + 2\hat\gamma\|z\|_1 + 2\|w - w^s\|_1 + \nu.$$

Since $\hat\gamma = \hat\gamma_s(A, \beta) < 1/2$, this results in

$$\|z\|_1 \le (1 - 2\hat\gamma)^{-1}\left[2\beta(\upsilon + \epsilon) + 2\|w - w^s\|_1 + \nu\right],$$

which is (3.20). ■
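To see how the ingredients of (3.20) trade off numerically, the right-hand side can be evaluated directly. The sketch below is only an arithmetic illustration; the function name, argument names, and sample values are assumptions made for the example, not quantities from the paper.

```python
def l1_recovery_error_bound(gamma_hat, beta, eps, upsilon, nu, tail):
    """Right-hand side of (3.20).

    gamma_hat : value of gamma_hat_s(A, beta), must lie in [0, 1/2)
    beta      : the finite parameter beta
    eps       : bound on the observation error
    upsilon   : feasibility inaccuracy of the approximate solution
    nu        : optimality inaccuracy of the approximate solution
    tail      : the s-tail ||w - w^s||_1 of the true signal
    """
    if not 0 <= gamma_hat < 0.5:
        raise ValueError("the bound requires 0 <= gamma_hat < 1/2")
    return (2 * beta * (upsilon + eps) + 2 * tail + nu) / (1 - 2 * gamma_hat)


# Hypothetical numbers: an exactly s-sparse signal (tail = 0), exact optimality (nu = 0).
bound = l1_recovery_error_bound(gamma_hat=0.25, beta=2.0, eps=0.01,
                                upsilon=0.01, nu=0.0, tail=0.0)
```

The bound blows up as `gamma_hat` approaches 1/2 and grows linearly in `beta`, which is the tension discussed right after the theorem.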



The bound (3.20) can be easily rewritten in terms of $\gamma_s\left(A, \frac{\beta}{1-\hat\gamma}\right) = \frac{\hat\gamma}{1-\hat\gamma} < 1$ instead of $\hat\gamma = \hat\gamma_s(A, \beta)$.

The error bound (3.20) for imperfect $\ell_1$-recovery, while being in some respects weaker than the RI-based
bound (1.7), is of the same structure as the latter bound: assuming $\beta < \infty$ and $\hat\gamma_s(A, \beta) < 1/2$ (or,
equivalently, $\gamma_s(A, 2\beta) < 1$), the error of imperfect $\ell_1$-recovery can be bounded in terms of $\hat\gamma_s(A, \beta)$, $\beta$,
the measurement error $\epsilon$, the "s-tail" $\|w - w^s\|_1$ of the signal to be recovered and the inaccuracy $(\upsilon, \nu)$ to which the
estimate solves the program (3.18). The only flaw in this interpretation is that we need $\hat\gamma_s(A, \beta) < 1/2$, while
the "true" necessary and sufficient condition for s-goodness is $\hat\gamma_s(A) < 1/2$. As we know, $\hat\gamma_s(A, \beta) = \hat\gamma_s(A)$
for all finite "large enough" values of $\beta$, but we do not want the "large enough" values of $\beta$ to be really
large, since the larger $\beta$ is, the worse is the error bound (3.20). Thus, we arrive at the question "what is
large enough" in our context. Here are two simple results in this direction.

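Proposition 2 Let A be a $k \times n$ sensing matrix of rank k.

(i) Let $\|\cdot\| = \|\cdot\|_2$. Then for every nonsingular $k \times k$ submatrix $\bar A$ of A and every $s \le k$ one has

$$\beta \ge \bar\beta = \sigma^{-1}(\bar A)\sqrt{k},\ \gamma_s(A) < 1 \ \Rightarrow\ \gamma_s(A, \beta) = \gamma_s(A), \tag{3.23}$$

where $\sigma(\bar A)$ is the minimal singular value of $\bar A$.

(ii) Let $\|\cdot\| = \|\cdot\|_1$, and let for certain $\rho > 0$ the image of the unit $\|\cdot\|_1$-ball in $\mathbb{R}^n$ under the mapping
$x \mapsto Ax$ contain the ball $B = \{u \in \mathbb{R}^k : \|u\|_1 \le \rho\}$. Then for every $s \le k$

$$\beta \ge \bar\beta = \frac{1}{\rho},\ \gamma_s(A) < 1 \ \Rightarrow\ \gamma_s(A, \beta) = \gamma_s(A). \tag{3.24}$$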
Proof. Given s, let $\gamma = \gamma_s(A) < 1$, so that for every vector $z \in \mathbb{R}^n$ with s nonzero entries, equal to $\pm 1$,
there exists $y \in \mathbb{R}^k$ such that $(A^T y)_i = {\rm sign}(z_i)$ when $z_i \ne 0$ and $|(A^T y)_i| \le \gamma$ otherwise. All we need is to
prove that in the situations of (i) and (ii) we have $\|y\|_* \le \bar\beta$.

In the case of (i) we clearly have $\|\bar A^T y\|_2 \le \sqrt{k}$, whence $\|y\|_* = \|y\|_2 \le \sigma^{-1}(\bar A)\|\bar A^T y\|_2 \le \sigma^{-1}(\bar A)\sqrt{k} = \bar\beta$,
as claimed. In the case of (ii) we have $\|A^T y\|_\infty \le 1$, whence

$$1 \ge \max_v \{v^T A^T y : v \in \mathbb{R}^n,\ \|v\|_1 \le 1\} = \max_u \{y^T u : u = Av,\ \|v\|_1 \le 1\} \underbrace{\ge}_{(*)} \max_u \{u^T y : u \in \mathbb{R}^k,\ \|u\|_1 \le \rho\} = \rho\|y\|_\infty = \rho\|y\|_*,$$

where (*) is due to the inclusion $\{u \in \mathbb{R}^k : \|u\|_1 \le \rho\} \subset A\{v \in \mathbb{R}^n : \|v\|_1 \le 1\}$ assumed in (ii). The resulting
inequality implies that $\|y\|_* \le 1/\rho$, as claimed. ■



4 Efficient bounding of $\hat\gamma_s(\cdot)$

In the previous section we have seen that the properties of a matrix A relative to $\ell_1$-recovery are governed
by the quantities $\hat\gamma_s(A, \beta)$ - the smaller they are, the better. While these quantities are difficult to compute,
we are about to demonstrate - and this is the primary goal of our paper - that $\hat\gamma_s(A, \beta)$ admits efficiently
computable "nontrivial" upper and lower bounds.

4.1 Efficient lower bounding of $\hat\gamma_s(A, \beta)$

Recall that $\hat\gamma_s(A, \beta) \ge \hat\gamma_s(A)$ for any $\beta > 0$. Thus, in order to provide a lower bound for $\hat\gamma_s(A, \beta)$ it suffices
to supply such a bound for $\hat\gamma_s(A)$. Theorem 2 suggests the following scheme for bounding $\hat\gamma_s(A)$ from below.
By (2.16) we have

$$\hat\gamma_s(A) = \max_{u \in P_s} f(u), \quad f(u) = \max_x \{x^T u : \|x\|_1 \le 1,\ Ax = 0\}.$$

The function f(u) clearly is convex and efficiently computable: given u and solving the LP problem

$$x_u \in \mathop{\rm Argmax}_x \{u^T x : \|x\|_1 \le 1,\ Ax = 0\},$$

we get a linear form $x_u^T v$ of $v \in P_s$ which underestimates f(v) everywhere and coincides with f(v) when
v = u. Therefore the easily computable quantity $\max_{v \in P_s} x_u^T v$ is a lower bound on $\hat\gamma_s(A)$. We now can
use the standard sequential convex approximation scheme for maximizing the convex function $f(\cdot)$ over $P_s$.
Specifically, we run the recurrence

$$u_{t+1} \in \mathop{\rm Argmax}_{u \in P_s} x_{u_t}^T u, \quad u_1 \in P_s,$$

thus obtaining a nondecreasing sequence of lower bounds $f(u_t) = x_{u_t}^T u_t$ on $\hat\gamma_s(A)$. We can terminate this
process when the improvement in bounds falls below a given threshold, and can make several runs starting
from randomly chosen points $u_1$.
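The recurrence above can be sketched as follows. The maximization of a linear form over $P_s$ has a closed form (put the sign of $x_i$ on the s largest $|x_i|$), while the inner LP is left to a user-supplied callable; the names `argmax_over_Ps`, `lower_bound` and `inner_lp_one_dim_kernel` are hypothetical. To keep the sketch self-contained, the toy inner solver assumes Ker(A) is spanned by one vector v, in which case the LP solution is simply $\pm v/\|v\|_1$; a general implementation would call an LP solver instead.

```python
def argmax_over_Ps(x, s):
    """A maximizer of u^T x over P_s: sign(x_i) on the s largest |x_i|, 0 elsewhere."""
    top = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:s]
    u = [0.0] * len(x)
    for i in top:
        u[i] = 1.0 if x[i] >= 0 else -1.0
    return u


def lower_bound(solve_inner_lp, u1, s, tol=1e-9, max_iter=100):
    """Iterate u_{t+1} in Argmax_{u in P_s} x_{u_t}^T u and return the best f(u_t) seen."""
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    u, best = u1, float("-inf")
    for _ in range(max_iter):
        x = solve_inner_lp(u)      # x_u: a maximizer of u^T x over {||x||_1 <= 1, Ax = 0}
        value = dot(x, u)          # f(u_t) = x_{u_t}^T u_t, a lower bound on gamma_hat_s(A)
        if value <= best + tol:    # stop once the improvement falls below the threshold
            break
        best = value
        u = argmax_over_Ps(x, s)
    return best


def inner_lp_one_dim_kernel(v):
    """Toy inner solver (assumption): Ker(A) = span{v}, so x_u = +/- v / ||v||_1."""
    norm1 = sum(abs(c) for c in v)

    def solve(u):
        sign = 1.0 if sum(a * b for a, b in zip(u, v)) >= 0 else -1.0
        return [sign * c / norm1 for c in v]

    return solve


# Kernel vector v = (3, -2, 1, 1), s = 2: the exact value is (3 + 2) / 7.
bound = lower_bound(inner_lp_one_dim_kernel([3.0, -2.0, 1.0, 1.0]),
                    u1=[0.0, 0.0, 1.0, 0.0], s=2)
```

In this toy case the iteration reaches the exact value after two steps; in general it only yields a lower bound, which is why several random restarts are suggested.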

4.2 Efficient upper bounding of $\hat\gamma_s(A, \beta)$

We have seen that the representation (2.16) suggests a computationally tractable scheme for bounding $\hat\gamma_s(A)$
from below. In fact, the same representation allows for a tractable way to bound $\hat\gamma_s(A)$ from above, which
is as follows. Whatever be a $k \times n$ matrix Y, we clearly have

$$\max_{u,x} \{u^T x : \|x\|_1 \le 1,\ Ax = 0,\ u \in P_s\} = \max_{u,x} \{u^T(x - Y^T Ax) : \|x\|_1 \le 1,\ Ax = 0,\ u \in P_s\},$$

whence also

$$\max_{u,x} \{u^T x : \|x\|_1 \le 1,\ Ax = 0,\ u \in P_s\} \le \max_{u,x} \{u^T(x - Y^T Ax) : \|x\|_1 \le 1,\ u \in P_s\}.$$

The right hand side in this relation is easily computable, since the objective in the right hand side problem
is linear in x, and the domain of x in this problem is the convex hull of just 2n points $\pm e_i$, $1 \le i \le n$, where
$e_i$ are the basic orths:

$$\max_{u,x} \{u^T(x - Y^T Ax) : \|x\|_1 \le 1,\ u \in P_s\} = \max_{1 \le i \le n} \max_{u \in P_s} |u^T(I - Y^T A)e_i| = \max_{1 \le i \le n} \|(I - Y^T A)e_i\|_{s,1}.$$

Thus, for all $Y \in \mathbb{R}^{k \times n}$,

$$\hat\gamma_s(A) = \max_{u,x} \{u^T x : \|x\|_1 \le 1,\ Ax = 0,\ u \in P_s\} \le f_{A,s}(Y) := \max_i \max_{u \in P_s} u^T[(I - Y^T A)e_i] = \max_i \|(I - Y^T A)e_i\|_{s,1},$$

so that when setting $\alpha_s(A, \infty) := \min_Y f_{A,s}(Y)$, we get

$$\hat\gamma_s(A) \le \alpha_s(A, \infty).$$

Since $f_{A,s}(Y)$ is an easy-to-compute convex function of Y, the quantity $\alpha_s(A, \infty)$ also is easy to compute
(in fact, this is the optimal value in an explicit LP program with sizes polynomial in k, n).
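Evaluating $f_{A,s}(Y)$ for a given Y needs only matrix arithmetic, and any Y yields a valid upper bound on $\hat\gamma_s(A)$. The sketch below evaluates the bound for a hand-picked Y; it does not perform the minimization over Y, which would require an LP solver. The function names and the example matrices are assumptions made for illustration.

```python
def s_norm(x, s):
    """||x||_{s,1}: sum of the s largest magnitudes among the entries of x."""
    return sum(sorted((abs(v) for v in x), reverse=True)[:s])


def f_upper_bound(A, Y, s):
    """f_{A,s}(Y) = max_i ||(I - Y^T A) e_i||_{s,1}; A and Y are k x n lists of rows."""
    k, n = len(A), len(A[0])
    best = 0.0
    for i in range(n):
        # column i of I - Y^T A: entry j equals delta_ij - sum_r Y[r][j] * A[r][i]
        col = [(1.0 if j == i else 0.0) - sum(Y[r][j] * A[r][i] for r in range(k))
               for j in range(n)]
        best = max(best, s_norm(col, s))
    return best


# Illustrative 2 x 3 matrix with Ker(A) = span{(1, 1, 1)}.
A = [[1.0, 0.0, -1.0],
     [0.0, 1.0, -1.0]]
Y_zero = [[0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0]]
# Hand-picked Y for which I - Y^T A is the matrix with all entries equal to 1/3.
Y_good = [[2.0 / 3.0, -1.0 / 3.0, -1.0 / 3.0],
          [-1.0 / 3.0, 2.0 / 3.0, -1.0 / 3.0]]

trivial = f_upper_bound(A, Y_zero, 1)    # 1.0: Y = 0 certifies nothing
certified = f_upper_bound(A, Y_good, 1)  # 1/3 < 1/2: certifies that A is 1-good
```

For this A the value 1/3 coincides with $\hat\gamma_1(A) = \|v\|_{1,1}/\|v\|_1$ for the kernel vector v = (1, 1, 1), so the hand-picked Y happens to be optimal for s = 1.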

This approach can be easily modified to provide an upper bound for $\hat\gamma_s(A, \beta)$. Namely, given a $k \times n$
matrix A and $s \le k$, $\beta \in [0, \infty]$, let us set

$$\alpha_s(A, \beta) = \min_{Y = [y_1,...,y_n] \in \mathbb{R}^{k \times n}} \left\{ \max_{1 \le j \le n} \|(I - Y^T A)e_j\|_{s,1} : \|y_i\|_* \le \beta,\ 1 \le i \le n \right\}. \tag{4.25}$$

As with $\gamma_s$, $\hat\gamma_s$, we shorten the notation $\alpha_s(A, \infty)$ to $\alpha_s(A)$.

It is easily seen that the optimization problem in (4.25) is solvable, and that $\alpha_s(A, \beta)$ is nondecreasing
in s, convex and nonincreasing in $\beta$, and is such that $\alpha_s(A, \beta) = \alpha_s(A)$ for all large enough values of $\beta$
(cf. similar properties of $\hat\gamma_s(A, \beta)$). The central observation in our context is that $\alpha_s(A, \beta)$ is an efficiently
computable upper bound on $\hat\gamma_s(A, s\beta)$, provided that the norm $\|\cdot\|$ is efficiently computable. Indeed, the
efficient computability of $\alpha_s(A, \beta)$ stems from the fact that this is the optimal value in an explicit convex
optimization problem with efficiently computable objective and constraints. The fact that $\alpha_s$ is an upper
bound on $\hat\gamma_s$ is stated by the following

Theorem 4 One has $\hat\gamma_s(A, s\beta) \le \alpha_s(A, \beta)$.

Proof. Let $z \in \mathbb{R}^n$ be an s-sparse vector with nonzero entries equal to $\pm 1$, and let I, a subset of
$\{1, ..., n\}$ of cardinality $\le s$, be the support of z. Let $Y = [y_1, ..., y_n]$ be such that $\|y_i\|_* \le \beta$ and the columns in
$\Delta = I - Y^T A$ are of the $\|\cdot\|_{s,1}$-norm not exceeding $\alpha_s(A, \beta)$. Setting y = Yz, we have $\|y\|_* \le \beta\|z\|_1 \le \beta s$
due to $\|y_j\|_* \le \beta$ for all j. Besides this,

$$\|z - A^T y\|_\infty = \|(I - A^T Y)z\|_\infty = \|\Delta^T z\|_\infty \le \alpha_s(A, \beta),$$

since the $\|\cdot\|_{s,1}$-norms of rows in $\Delta^T$ do not exceed $\alpha_s(A, \beta)$ and z is an s-sparse vector with nonzero entries
$\pm 1$. We conclude that $\hat\gamma_s(A, s\beta) \le \alpha_s(A, \beta)$, as claimed. ■

Some comments are in order. 
A. For the same reasons as in the previous section, it is important to know how large $\beta$ should be in order
to have $\alpha_s(A, \beta) = \alpha_s(A)$. Possible answers are as follows. Let A be a $k \times n$ matrix of rank k. Then

(i) Let $\|\cdot\| = \|\cdot\|_2$. Then for every nonsingular $k \times k$ submatrix $\bar A$ of A and every $s \le k$ one has

$$\beta \ge \bar\beta = \frac{3}{2}\sigma^{-1}(\bar A)\sqrt{k},\ \alpha_s(A) < 1/2 \ \Rightarrow\ \alpha_s(A, \beta) = \alpha_s(A), \tag{4.26}$$

where $\sigma(\bar A)$ is the minimal singular value of $\bar A$.

(ii) Let $\|\cdot\| = \|\cdot\|_1$, and let for certain $\rho > 0$ the image of the unit $\|\cdot\|_1$-ball in $\mathbb{R}^n$ under the mapping
$x \mapsto Ax$ contain the centered at the origin $\|\cdot\|_1$-ball of radius $\rho$ in $\mathbb{R}^k$. Then for every $s \le k$

$$\beta \ge \bar\beta = \frac{3}{2\rho},\ \alpha_s(A) < 1/2 \ \Rightarrow\ \alpha_s(A, \beta) = \alpha_s(A). \tag{4.27}$$

The proof is completely similar to the one of Proposition 2.

Note that the above bounds on $\beta$ "large enough to ensure $\alpha_s(A, \beta) = \alpha_s(A)$", same as their counterparts
in Proposition 2, however conservative they might be, are "constructive": to use the bound (4.26), it suffices
to find any nonsingular $k \times k$ submatrix of A and to compute its minimal singular value. To use
(4.27), it suffices to solve k LP programs

$$\rho_i = \max_{x,\rho} \{\rho : \|x\|_1 \le 1,\ (Ax)_j = \rho\delta_j^i,\ 1 \le j \le k\}, \quad i = 1, ..., k$$

($\delta_j^i$ are the Kronecker symbols) and to set $\rho = \min_i \rho_i$.



B. Whenever s, t are positive integers, we clearly have $\|z\|_{st,1} \le s\|z\|_{t,1}$, whence

$$\alpha_s(A, \beta) \le s\alpha_1(A, \beta). \tag{4.28}$$

Thus, we can replace in Theorem 4 the quantity $\alpha_s(A, \beta)$ with $s\alpha_1(A, \beta)$. Further, we have $\alpha_1(A, \beta) =
\max_i \alpha^i$, where

$$\alpha^i := \min_{y_i} \{\|e_i - A^T y_i\|_\infty : \|y_i\|_* \le \beta\}, \quad i = 1, ..., n. \tag{4.29}$$

On the other hand, we have

$$\alpha^i = \min_{y_i} \max_j \{|(e_i - A^T y_i)_j| : \|y_i\|_* \le \beta\} = \min_{y_i} \max_x \{(e_i - A^T y_i)^T x : \|y_i\|_* \le \beta,\ \|x\|_1 \le 1\}$$
$$= \max_x \min_{y_i} \{e_i^T x - y_i^T Ax : \|y_i\|_* \le \beta,\ \|x\|_1 \le 1\} = \max_x \{e_i^T x - \beta\|Ax\| : \|x\|_1 \le 1\} \le \hat\gamma_1(A, \beta),$$

and by Theorem 4 we conclude that

$$\alpha_1(A, \beta) = \hat\gamma_1(A, \beta), \tag{4.30}$$

i.e. the relaxation for $\hat\gamma_1(A, \beta)$ is exact. As a compensation for the increased conservatism of the bound (4.28),
note that while both $\alpha_s$ and $\alpha_1$ are efficiently computable, the second quantity is computationally "much
cheaper". Indeed, computing $\alpha_1(A, \beta)$ reduces to solving n convex programs (4.29) of design dimension k
each. In contrast to this, solving (4.25) with $s \ge 2$ seemingly cannot be decomposed in the aforementioned
manner, while "as it is" (4.25) is a convex program of design dimension kn. Unless k, n are small, solving
a single optimization program of design dimension kn usually is much more demanding computationally than
solving n programs of similar structure with design dimension k each.
4.3 Relation to the mutual incoherence condition

Remarks in B point at some simple, although instructive, conclusions. Let A be a $k \times n$ matrix with nonzero
columns $A_j$, $1 \le j \le n$, and let $\mu(A)$ be its mutual incoherence, as defined in (1.8).

Proposition 3 For $\beta(A) = \max_{1 \le j \le n} \frac{\|A_j\|_*}{\|A_j\|_2^2}$ we have

$$\alpha_1\left(A, \frac{\beta(A)}{1 + \mu(A)}\right) \le \frac{\mu(A)}{1 + \mu(A)}. \tag{4.31}$$

In particular, when $\mu(A) < 1$ and $1 \le s < \frac{1 + \mu(A)}{2\mu(A)}$, one has

$$\hat\gamma_s\left(A, \frac{s\beta(A)}{1 + \mu(A)}\right) \le \frac{s\mu(A)}{1 + \mu(A)} < 1/2. \tag{4.32}$$

Proof. Indeed, with $Y = [A_1/\|A_1\|_2^2, ..., A_n/\|A_n\|_2^2]$, the diagonal entries in $Y^T A$ are equal to 1, while the
off-diagonal entries are in absolute values $\le \mu(A)$; besides this, the $\|\cdot\|_*$-norms of the columns of Y do not
exceed $\beta(A)$. Consequently, for $Y_+ = \frac{1}{1 + \mu(A)}Y$ the absolute values of all entries in $I - Y_+^T A$ are $\le \frac{\mu(A)}{1 + \mu(A)}$,
while the $\|\cdot\|_*$-norms of columns of $Y_+$ do not exceed $\frac{\beta(A)}{1 + \mu(A)}$. We see that the right hand side in the relation

$$\alpha_1\left(A, \frac{\beta(A)}{1 + \mu(A)}\right) = \min_{Y = [y_1,...,y_n]} \left\{ \max_{i,j} |(I - Y^T A)_{ij}| : \|y_i\|_* \le \frac{\beta(A)}{1 + \mu(A)} \right\}$$

does not exceed $\frac{\mu(A)}{1 + \mu(A)}$, since $Y_+$ is a feasible solution for the optimization program in the right hand side. This
implies the bound (4.31).

To show (4.32) note that from (4.31) with (5 = i|jf(iy and (4.28) we have 

S/i(A) 



as{A,p) < sai(A,/3) < 



and it remains to invoke Theorem 4 and Theorem 1 (ii). 



Observe that taken along with Theorem 3, bound (4.32) recovers some of the results from [17]. 

Proposition 3 implies that computing $\alpha_s(\cdot,\cdot)$ allows to infer that a given $k\times n$ matrix $A$ is $s$-good, for "reasonably large" values of $s$. Indeed, take a realization of a random $k\times n$ matrix with independent entries taking values $\pm1/\sqrt k$ with probabilities $1/2$. For such a matrix $A$, with an appropriate absolute constant $O(1)$ one clearly has $\mu(A)\le O(1)\sqrt{\ln(n)/k}$ with probability $\ge1/2$, meaning that $\hat\gamma_s(A,2s\beta(A))<1/2$ for $s\le O(1)\sqrt{k/\ln(n)}$. Note that such verifiable sufficient conditions for $s$-goodness based on mutual incoherence are certainly not new, see [17]. We use them here to show that our machinery does allow sometimes to justify $s$-goodness for "nontrivial" values of $s$, like $O(\sqrt{k/\ln(n)})$.
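The certificate used in the proof of Proposition 3 can be checked numerically. The sketch below assumes that (1.8) defines $\mu(A)=\max_{i\ne j}|A_i^TA_j|/(A_i^TA_i)$ (the form consistent with the proof above), builds the matrix $Y_+$ from the proof, and verifies that all entries of $I-Y_+^TA$ are at most $\mu(A)/(1+\mu(A))$ in magnitude. The function names and the small test matrix are hypothetical, chosen only for illustration.

```python
import math

def columns(A):
    """Columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def mutual_incoherence(A):
    """mu(A) = max over i != j of |A_i^T A_j| / (A_i^T A_i) (assumed form of (1.8))."""
    cols = columns(A)
    mu = 0.0
    for i, ai in enumerate(cols):
        norm_sq = sum(v * v for v in ai)
        for j, aj in enumerate(cols):
            if i != j:
                mu = max(mu, abs(sum(p * q for p, q in zip(ai, aj))) / norm_sq)
    return mu

def max_certified_s(mu):
    """Largest integer s with s < (1 + mu) / (2 mu); requires mu > 0."""
    if mu <= 0.0:
        raise ValueError("mu must be positive")
    return max(math.ceil((1.0 + mu) / (2.0 * mu)) - 1, 0)

def certificate_check(A):
    """Largest |entry| of I - Y_+^T A, with Y_+ as in the proof, and the bound mu/(1+mu)."""
    cols = columns(A)
    mu = mutual_incoherence(A)
    worst = 0.0
    for i, ai in enumerate(cols):
        scale = 1.0 / ((1.0 + mu) * sum(v * v for v in ai))  # row i of Y_+^T
        for j, aj in enumerate(cols):
            entry = (1.0 if i == j else 0.0) - scale * sum(p * q for p, q in zip(ai, aj))
            worst = max(worst, abs(entry))
    return worst, mu / (1.0 + mu)

# Illustrative 2 x 3 matrix with unit-norm columns.
A = [[1.0, 0.0, 0.6],
     [0.0, 1.0, 0.8]]
mu = mutual_incoherence(A)
worst, bound = certificate_check(A)
print(mu, max_certified_s(mu), worst <= bound + 1e-12)
```

For this toy matrix the incoherence is large, so only $s=1$ is certified; for the random sign matrices discussed above one would expect $\mu(A)$ of order $\sqrt{\ln(n)/k}$.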

* Note that the "Euclidean origin" of the mutual incoherence is not essential in the derivation of Proposition 3. We could start with an arbitrary norm $p(\cdot)$ on $\mathbb{R}^k$ which is, say, differentiable outside of the origin, define $\mu(A)$ as $\max_{i\ne j}|A_j^Tp'(A_i)|/p(A_i)$ and define $\beta(A)$ as $\max_i\|p'(A_i)/p(A_i)\|_*$, arriving at the same results.

4.4 Application to weighted $\ell_1$-recovery

Note that $\ell_1$-recovery "as it is" makes sense only when $A$ is properly normalized, so that, speaking informally, $Ax$ is "affected equally" by all entries in $x$. In a general case, one could prefer to use a "weighted" $\ell_1$-recovery

$$\hat x_\Lambda(y)\in\operatorname*{Argmin}_{z\in\mathbb{R}^n}\left\{\|\Lambda z\|_1:\ \|Az-y\|\le\varepsilon\right\},\qquad(4.33)$$

where $\Lambda$ is a diagonal matrix with positive diagonal entries $\lambda_i$, $1\le i\le n$, which, without loss of generality, we always assume to be $\le1$. By the change of variables $x=\Lambda^{-1}\xi$, investigating $\Lambda$-weighted $\ell_1$-recovery reduces to investigating the standard recovery with the matrix $A\Lambda^{-1}$ in the role of $A$, followed by a simple "translation" of the results into the language of the original variables. For example, the "weighted" version of our basic Theorem 3 reads as follows:

Theorem 5 Let $A$ be a $k\times n$ matrix, $\Lambda$ be an $n\times n$ diagonal matrix with positive entries, $s\le n$ be a nonnegative integer, and let $\beta\in[0,\infty)$ be such that $\hat\gamma:=\hat\gamma_s(A\Lambda^{-1},\beta)<1/2$. Let also $w\in\mathbb{R}^n$, $\omega=\Lambda w$, and let $\omega^s$ be the vector obtained from $\omega$ by zeroing all coordinates except for the $s$ largest in magnitude. Assume, further, that $y$ in (4.33) satisfies the relation $\|Aw-y\|\le\varepsilon$, and that $x$ is a $(\upsilon,\nu)$-optimal solution to (4.33), meaning that

$$\|Ax-y\|\le\upsilon\quad\text{and}\quad\|\Lambda x\|_1\le\nu+\mathrm{Opt}(y),$$

where $\mathrm{Opt}(y)$ is the optimal value of (4.33). Then

$$\|\Lambda(x-w)\|_1\le(1-2\hat\gamma)^{-1}\left[2\beta(\upsilon+\varepsilon)+2\|\omega-\omega^s\|_1+\nu\right].\qquad(4.34)$$
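To see how the terms of (4.34) trade off, the right hand side can be evaluated directly. This is a minimal sketch; the function and argument names are illustrative, and the numerical inputs below are made up for the example rather than taken from any experiment.

```python
def tail_l1_norm(omega, s):
    """||omega - omega^s||_1: the l1 norm of omega with its s largest-magnitude entries removed."""
    mags = sorted((abs(v) for v in omega), reverse=True)
    return sum(mags[s:])

def weighted_recovery_error_bound(gamma_hat, beta, upsilon, eps, tail_l1, nu):
    """Right hand side of (4.34); meaningful only when gamma_hat < 1/2."""
    if not 0.0 <= gamma_hat < 0.5:
        raise ValueError("the bound requires 0 <= gamma_hat < 1/2")
    return (2.0 * beta * (upsilon + eps) + 2.0 * tail_l1 + nu) / (1.0 - 2.0 * gamma_hat)

# Hypothetical inputs: a nearly 2-sparse weighted signal and small noise.
omega = [3.0, -1.0, 0.03, 2.0, -0.02]
tail = tail_l1_norm(omega, 3)
bound = weighted_recovery_error_bound(gamma_hat=0.25, beta=2.0, upsilon=0.1,
                                      eps=0.1, tail_l1=tail, nu=0.0)
print(tail, bound)
```

The factor $(1-2\hat\gamma)^{-1}$ blows up as $\hat\gamma$ approaches $1/2$, which is why the choice of the scaling matrix discussed next aims at keeping $\hat\gamma$ bounded away from $1/2$.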

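The issue we want to address here is how to choose the scaling matrix $\Lambda$. When our goal is to recover well signals with as many nonzero entries as possible, we would prefer to make $\hat\gamma_s(A\Lambda^{-1},\beta)<1/2$ for as large $s$ as possible (see Theorem 1), imposing a reasonable lower bound on the diagonal entries in $\Lambda$ (the latter allows to keep the left hand side in (4.34) meaningful in terms of the original variables). The difficulty is that $\hat\gamma_s(A\Lambda^{-1},\beta)$ is hard to compute, not speaking about minimizing it in $\Lambda$. However, we can minimize in $\Lambda$ the efficiently computable quantity $\alpha_s(A\Lambda^{-1},\tilde\beta)$, $\tilde\beta=\beta/s$, which is an upper bound on $\hat\gamma_s(A\Lambda^{-1},\beta)$. Indeed, let

$$\mathcal{Y}=\left\{Y=[y_1,\dots,y_n]:\ \|y_i\|_*\le\tilde\beta,\ 1\le i\le n\right\}.$$

Denoting by $A_i$ the columns of $A$, we have

$$\begin{aligned}\alpha_s(A\Lambda^{-1},\tilde\beta)&=\min_{Y\in\mathcal{Y}}\max_{1\le i\le n}\|e_i-Y^TA_i\lambda_i^{-1}\|_{s,1}\\&=\min_{\alpha,\,Y\in\mathcal{Y}}\left\{\alpha:\ \|e_i-Y^TA_i\lambda_i^{-1}\|_{s,1}\le\alpha,\ 1\le i\le n\right\}\\&=\min_{\alpha,\,Y\in\mathcal{Y}}\left\{\alpha:\ \|\lambda_ie_i-Y^TA_i\|_{s,1}\le\alpha\lambda_i,\ 1\le i\le n\right\},\end{aligned}$$

so that the problem of minimizing $\alpha_s(A\Lambda^{-1},\tilde\beta)$ in $\Lambda$ under the restriction $0<\ell\le\lambda_i\le1$ on the diagonal entries of $\Lambda$ reads

$$\min_{\{\lambda_i\},\,\alpha,\,Y\in\mathcal{Y}}\left\{\alpha:\ \|\lambda_ie_i-Y^TA_i\|_{s,1}\le\alpha\lambda_i,\ \ell\le\lambda_i\le1,\ 1\le i\le n\right\}.\qquad(4.35)$$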

The resulting problem, while not being exactly convex, reduces, by bisection in $\alpha$, to a "small series" of explicit convex problems and thus is efficiently solvable. In our context, the situation is even better: basically, all we want is to impose on the would-be $\hat\gamma_s$ an upper bound $\hat\gamma_s(A\Lambda^{-1},\beta)\le\gamma$ with a given $\gamma<1/2$, and this reduces to solving a single explicit convex feasibility problem

$$\text{find }\{\lambda_i\in[\ell,1]\}_{i=1}^n\text{ and }Y\in\mathcal{Y}\text{ such that }\|\lambda_ie_i-Y^TA_i\|_{s,1}\le\gamma\lambda_i,\ 1\le i\le n.$$
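The bisection in $\alpha$ mentioned above can be sketched generically. The code below is only an outline under stated assumptions: `is_feasible` is a hypothetical oracle that, for fixed $\alpha$, reports whether the convex feasibility problem obtained from (4.35) has a solution (in practice it would call a convex solver, which is not shown). Feasibility is monotone in $\alpha$, and $\alpha=1$ is always feasible (take $Y=0$), which is what makes bisection on $[0,1]$ legitimate.

```python
def bisect_min_alpha(is_feasible, lo=0.0, hi=1.0, tol=1e-6):
    """Approximate the smallest alpha in [lo, hi] with is_feasible(alpha) == True.

    Assumes monotonicity: if alpha is feasible, so is every larger alpha.
    Returns a feasible value within tol of the true threshold.
    """
    if not is_feasible(hi):
        raise ValueError("upper end of the bracket must be feasible")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_feasible(mid):
            hi = mid   # keep the feasible end
        else:
            lo = mid   # threshold lies above mid
    return hi

# Toy oracle standing in for the convex feasibility problem (illustration only).
toy_oracle = lambda alpha: alpha >= 0.3
alpha_star = bisect_min_alpha(toy_oracle)
print(alpha_star)
```

Each oracle call is one convex problem, so about $\log_2(1/\mathrm{tol})$ of them are solved, in line with the "small series" mentioned in the text.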
4.5 Limits of performance 

As we have seen in Section 4.3, the bounding mechanism based on computing $\alpha_s(\cdot,\cdot)$ allows to certify $s$-goodness of a $k\times n$ sensing matrix for $s$ as large as $O(\sqrt{k/\ln(n)})$. Unfortunately, the $O(\sqrt k)$-level of values of $s$ is the largest which can be justified via the proposed approach, unless $A$ is "nearly square".
4.6 $\sqrt{k}$-bound

Proposition 4 For every $k\times n$ matrix $A$ with $n\ge32k$, every $s$, $1\le s\le n$, and every $\beta\in[0,\infty]$ one has

$$\alpha_s(A,\beta)\ge\min\left[\frac{3s}{4(s+\sqrt{2k})},\ \frac12\right].\qquad(4.36)$$

In particular, in order for $\alpha_s(A,\beta)$ to be $<1/2$ (which, according to Theorems 4 and 1, is a verifiable sufficient condition for $s$-goodness of $A$), one should have $s<2\sqrt{2k}$.
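The arithmetic behind the threshold $s<2\sqrt{2k}$ can be made concrete. A minimal sketch (function names are illustrative): the first function evaluates the right hand side of (4.36), the second returns the largest integer $s$ that the bound does not rule out, using the equivalent integer condition $s^2<8k$. Both are meaningful only under the premise $n\ge32k$ of Proposition 4.

```python
import math

def alpha_lower_bound(s, k):
    """Right hand side of (4.36): min[3s / (4(s + sqrt(2k))), 1/2]."""
    return min(3.0 * s / (4.0 * (s + math.sqrt(2.0 * k))), 0.5)

def max_s_not_excluded(k):
    """Largest integer s with s < 2*sqrt(2k), i.e. s^2 < 8k (for k >= 1)."""
    return math.isqrt(8 * k - 1)

k = 128
s_max = max_s_not_excluded(k)
print(s_max, alpha_lower_bound(s_max, k), alpha_lower_bound(s_max + 1, k))
```

For $k=128$ the threshold is $2\sqrt{256}=32$, so the condition $\alpha_s(A,\beta)<1/2$ can hold at most for $s\le31$, whatever the matrix.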

Proof. Let $\alpha:=\alpha_s(A,\beta)$; note that $\alpha\le1$. Observe that

$$\forall v\in\mathbb{R}^n:\quad\|v\|_2^2\le\|v\|_{s,1}^2\max\left[1,\frac{n}{s^2}\right].\qquad(4.37)$$

Postponing for a while the proof of (4.37), let us derive (4.36) from this relation. Assume, first, that $s^2\le n$. Let $Y\in\mathbb{R}^{k\times n}$ be such that $\|[I-Y^TA]_j\|_{s,1}\le\alpha$ for all $j$, where $[B]_j$ is the $j$-th column of $B$. Setting $Q=I-Y^TA$, we get a matrix with $\|\cdot\|_{s,1}$-norms of columns $\le\alpha$. From (4.37) it follows that the Frobenius norm of $Q$ satisfies the relation

$$\|Q\|_F^2:=\sum_{i,j}Q_{ij}^2\le\frac{n^2\alpha^2}{s^2}.$$

Consequently,

$$\|Q^T\|_F^2\le\frac{n^2\alpha^2}{s^2},$$

whence, setting $B=\frac12(Q+Q^T)$, we get

$$\|B\|_F^2\le\frac{n^2\alpha^2}{s^2}$$

as well. Further, the magnitudes of the diagonal entries in $Q$ (and thus in $Q^T$ and in $B$) are at most $\alpha$, whence $\mathrm{Tr}(I-B)\ge n(1-\alpha)$. The matrix $I-B=\frac12[Y^TA+A^TY]$ is of rank at most $2k$ and thus has at most $2k$ nonzero eigenvalues. As we have seen, the sum of these eigenvalues is $\ge n(1-\alpha)$, whence the sum of their squares (i.e., $\|I-B\|_F^2$) is at least $\frac{n^2(1-\alpha)^2}{2k}$. We have arrived at the relation

$$\frac{n(1-\alpha)}{\sqrt{2k}}\le\|I-B\|_F\le\|I\|_F+\|B\|_F\le\sqrt n+\frac{n\alpha}{s},$$

whence

$$\alpha n\left[\frac{1}{\sqrt{2k}}+\frac1s\right]\ge\frac{n}{\sqrt{2k}}-\sqrt n\ge\frac{3n}{4\sqrt{2k}}$$

(the concluding inequality is due to $n\ge32k$), and (4.36) follows. We have derived (4.36) from (4.37) when $s^2\le n$; in the case of $s^2>n$, let $s'=\lfloor\sqrt n\rfloor$, so that $s'\le s$. Applying the just outlined reasoning to $s'$ in the role of $s$, we get $\alpha_{s'}(A,\beta)\ge\frac{3s'}{4(s'+\sqrt{2k})}$, and the latter quantity is $\ge1/2$ due to $n\ge32k$ and the origin of $s'$. Since $s\ge s'$, we have $\alpha_s(A,\beta)\ge\alpha_{s'}(A,\beta)\ge1/2$, and (4.36) holds true.

It remains to prove (4.37). W.l.o.g. we can assume that $v_1\ge v_2\ge\dots\ge v_n\ge0$ and $\|v\|_{s,1}=1$; let us upper bound $\|v\|_2^2$ under these conditions. Setting $v_{s+1}=\lambda$, observe that $0\le\lambda\le\frac1s$ and that for $\lambda$ fixed, we have

$$\|v\|_2^2\le\max_{v_1,\dots,v_s}\left\{\sum_{i=1}^sv_i^2:\ \sum_{i=1}^sv_i=1,\ v_i\ge\lambda,\ 1\le i\le s\right\}+(n-s)\lambda^2.$$

The maximum in the right hand side is achieved at an extreme point of the set $\{v\in\mathbb{R}^s:\ \sum_{i=1}^sv_i=1,\ v_i\ge\lambda\}$, that is, at a point where all but one of the $v_i$'s are equal to $\lambda$, and the remaining one is $1-(s-1)\lambda$. Thus,

$$\begin{aligned}\|v\|_2^2&\le[1-(s-1)\lambda]^2+(s-1)\lambda^2+(n-s)\lambda^2=1-2(s-1)\lambda+(s^2-2s+n)\lambda^2\\&\le\max_{0\le\lambda\le1/s}\left[1-2(s-1)\lambda+(s^2-2s+n)\lambda^2\right].\end{aligned}$$

The maximum in the right hand side is achieved at an endpoint of the segment $[0,1/s]$, i.e., is equal to $\max[1,n/s^2]$, as claimed.
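Relation (4.37) can be sanity-checked on random vectors. The sketch below is a numerical illustration only, not a substitute for the proof; it assumes $\|v\|_{s,1}$ is the sum of the $s$ largest magnitudes, and the helper names are hypothetical. The all-ones vector shows that the bound is tight when $s^2\le n$.

```python
import random

def norm_s1(v, s):
    """Sum of the s largest magnitudes of the entries of v."""
    return sum(sorted((abs(x) for x in v), reverse=True)[:s])

def satisfies_437(v, s):
    """Check ||v||_2^2 <= ||v||_{s,1}^2 * max(1, n/s^2), with a tiny relative slack."""
    n = len(v)
    lhs = sum(x * x for x in v)
    rhs = norm_s1(v, s) ** 2 * max(1.0, n / (s * s))
    return lhs <= rhs * (1.0 + 1e-12)

rng = random.Random(0)
all_ok = all(
    satisfies_437([rng.gauss(0.0, 1.0) for _ in range(50)], s)
    for s in range(1, 51)
    for _ in range(20)
)

# Tightness: for the all-ones vector with n = 16 and s = 2 both sides equal 16.
ones = [1.0] * 16
tight_lhs = sum(x * x for x in ones)
tight_rhs = norm_s1(ones, 2) ** 2 * max(1.0, 16 / 4)
print(all_ok, tight_lhs, tight_rhs)
```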



Discussion. Proposition 4 is really bad news: it shows that our verifiable sufficient condition fails to establish $s$-goodness when $s>O(1)\sqrt k$, unless $A$ is "nearly square". This "ultimate limit of performance" is much worse than the actual values of $s$ for which a $k\times n$ matrix $A$ may be $s$-good. Indeed, it is well known, see, e.g., [9], that a random $k\times n$ matrix with i.i.d. Gaussian or $\pm1$ elements is, with probability close to 1, $s$-good for $s$ as large as $O(1)k/\ln(n/k)$. This is, of course, much larger than the above limit $s\le O(\sqrt k)$. Recall, however, that we are interested in an efficiently verifiable sufficient condition for $s$-goodness, and efficient verifiability has its price. At this moment we do not know whether the "price of efficiency" can be made better than the one for the proposed approach. Note, however, that for all known deterministic provably $s$-good $k\times n$ matrices $s$ is $\le O(1)\sqrt k$, provided $n\gg k$ [ ].



5 Restricted isometry property and characterization of s-goodness 

Recall that the RI property (1.6) plays the central role in the existing Compressed Sensing results, like the following one: For properly chosen absolute constants $\delta\in(0,1)$ and integer $\kappa>1$ (e.g., for $\delta<\sqrt2-1$, $\kappa=2$, see [10, Theorem 1.1]), a matrix possessing the RI($\delta,m$) property is $s$-good, provided that $m\ge\kappa s$. By Theorem 1 it follows that with an appropriate $\delta\in(0,1)$, the RI($\delta,m$)-property of $A$ implies that $\gamma_s(A)<1$, provided $m\ge\kappa s$. Thus, the RI property possesses important implications in terms of the characterization/verifiable sufficient conditions for $s$-goodness as developed above. While these implications do not contribute to the "constructive" part of our results (since the RI property is seemingly difficult to verify), they certainly contribute to a better understanding of our approach and to integrating it into the existing Compressed Sensing theory. In this section, we present the "explicit forms" of several of those implications.



5.1 Bounding $\hat\gamma_s(A)$ for RI sensing matrices

Proposition 5 Let $s$ be a positive integer, and let $A$ be a $k\times n$ matrix possessing the RI($\delta,2s$)-property with $0<\delta<\sqrt2-1$. Then

$$\hat\gamma_s(A)\le\frac{\sqrt2\delta}{1+(\sqrt2-1)\delta}<\frac12\quad\text{and}\quad\gamma_s(A)\le\frac{\sqrt2\delta}{1-\delta}<1.\qquad(5.38)$$

Proof. Observe that by Lemma 2.2 of [ ], for any vector $h\in\mathrm{Ker}(A)$ and any index set $I$ of cardinality $\le m/2$ (here $m=2s$) we have under the premise of the Proposition:

$$\sum_{i\in I}|h_i|\le\rho\sum_{i\notin I}|h_i|,\qquad\rho=\sqrt2\delta(1-\delta)^{-1}.$$

This implies that for any $h\in\mathrm{Ker}(A)$ one has $\|h\|_{s,1}\le\rho(\|h\|_1-\|h\|_{s,1})$, that is, $\|h\|_{s,1}\le\frac{\rho}{1+\rho}\|h\|_1$. By Corollary 1 it follows that $\hat\gamma_s(A)\le\frac{\rho}{1+\rho}$ $(<1/2)$, and thus $\gamma_s(A)\le\rho$ $(<1)$.

Combining Proposition 5 and Theorem 1, we arrive at a sufficient condition for $s$-goodness in terms of the RI-property identical to the one in [10, Theorem 1.1]: a matrix $A$ is $s$-good if it possesses the RI($\delta,2s$)-property with $\delta<\sqrt2-1$.
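The two bounds in (5.38) are simple functions of $\delta$, related through $\rho=\sqrt2\delta/(1-\delta)$. A small sketch (the function name is illustrative) that evaluates them and confirms the identity $\rho/(1+\rho)=\sqrt2\delta/(1+(\sqrt2-1)\delta)$ numerically:

```python
import math

def ri_gamma_bounds(delta):
    """Return (bound on gamma_hat_s(A), bound on gamma_s(A)) from (5.38).

    Requires 0 < delta < sqrt(2) - 1, the range assumed in Proposition 5.
    """
    if not 0.0 < delta < math.sqrt(2.0) - 1.0:
        raise ValueError("delta must lie in (0, sqrt(2) - 1)")
    rho = math.sqrt(2.0) * delta / (1.0 - delta)
    return rho / (1.0 + rho), rho

delta = 0.2
hat_bound, rho = ri_gamma_bounds(delta)
closed_form = math.sqrt(2.0) * delta / (1.0 + (math.sqrt(2.0) - 1.0) * delta)
print(hat_bound, rho, closed_form)
```

As $\delta$ approaches $\sqrt2-1$, $\rho$ approaches 1 and the first bound approaches $1/2$, which is exactly where the sufficient condition for $s$-goodness stops being informative.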
The representation (2.15) also allows to bound the value of $\hat\gamma_s(A,\beta)$ and the corresponding $\beta$ in the case when the Restricted Eigenvalue assumption RE($s,\rho,\kappa$) of [1] holds true. The exact formulation of the latter assumption is as follows. Let $I$ be an arbitrary subset of indices of cardinality $s$; for $x\in\mathbb{R}^n$, let $x^I$ be the vector obtained from $x$ by zeroing all the entries with indices outside of $I$. A sensing matrix $A$ is RE($s,\rho,\kappa$) if

$$\kappa(s,\rho):=\min_{x,I}\left\{\frac{\|Ax\|_2}{\|x^I\|_2}:\ x\in\mathbb{R}^n,\ \rho\|x^I\|_1\ge\|x-x^I\|_1,\ \mathrm{Card}(I)=s\right\}\ge\kappa>0.$$

Note that the condition $\rho\|x^I\|_1\ge\|x-x^I\|_1$ is equivalent to $\|x^I\|_1\ge(1+\rho)^{-1}\|x\|_1$, and $\frac{\|Ax\|_2}{\|x^I\|_2}\ge\kappa$ implies that $\|x^I\|_1\le\kappa^{-1}\sqrt s\|Ax\|_2$. Thus if the RE($s,\rho,\kappa$) assumption holds for $A$, we clearly have for any $x\in\mathbb{R}^n$

$$\|x\|_{s,1}\le\max\left\{\kappa^{-1}\sqrt s\|Ax\|_2,\ (1+\rho)^{-1}\|x\|_1\right\}.$$

In other words, assumption RE($s,\rho,\kappa$) implies that

$$\hat\gamma_s\left(A,\frac{\sqrt s}{\kappa}\right)\le(1+\rho)^{-1}.$$



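5.2 "Large enough" values of $\beta$

We present here an upper bound on the value of $\beta$ such that $\hat\gamma_s(A,\beta)=\hat\gamma_s(A)$ in the case when the matrix $A$ possesses the RI-property (cf. Proposition 2):

Proposition 6 Let $s$ be a positive integer, $A$ be a $k\times n$ matrix possessing the RI($\delta,2s$)-property with $0<\delta<\sqrt2-1$ and $s\le n$, and let $\|\cdot\|$ be the $\ell_2$-norm. Then

$$\beta\ge\frac{\sqrt{(1+\delta)s}}{1+(\sqrt2-1)\delta}\ \Rightarrow\ \hat\gamma_s(A,\beta)\le\frac{\sqrt2\delta}{1+(\sqrt2-1)\delta}.\qquad(5.39)$$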



Proof. The derivations below are rather standard in Compressed Sensing. Let us prove that

$$\forall w\in\mathbb{R}^n:\quad\|w\|_{s,1}\le\frac{\sqrt{(1+\delta)s}}{1+(\sqrt2-1)\delta}\|Aw\|_2+\frac{\sqrt2\delta}{1+(\sqrt2-1)\delta}\|w\|_1.\qquad(5.40)$$

There is nothing to prove when $w=0$; assuming $w\ne0$, by homogeneity we can assume that $\|w\|_1=1$. Besides this, w.l.o.g. we may assume that $|w_1|\ge|w_2|\ge\dots\ge|w_n|$. Let us set $\alpha=\|Aw\|_2$. Let us split $w$ into consecutive $s$-element "blocks" $w^0,w^1,\dots,w^q$, so that $w^0$ is obtained from $w$ by zeroing all coordinates except for the first $s$ of them, $w^1$ is obtained from $w$ by zeroing all coordinates except for those with indices $s+1,s+2,\dots,2s$, and so on, with evident modification for the last block $w^q$. By construction we have

$$w=\sum_{j=0}^qw^j,\qquad\sum_{j=0}^q\|w^j\|_1=\|w\|_1=1.$$

Further, we have due to the monotonicity of $|w_i|$ and $s$-sparsity of all $w^j$:

$$j\ge1\ \Rightarrow\ \|w^j\|_2^2\le\|w^j\|_\infty\|w^j\|_1\le s^{-1}\|w^{j-1}\|_1\|w^j\|_1\le s^{-1}\|w^{j-1}\|_1^2.\qquad(5.41)$$

On the other hand, due to the RI-property of $A$ and the fact that $\|Aw\|_2=\alpha$ we have the first inequality in the following chain:

$$\begin{aligned}\alpha\sqrt{1+\delta}\,\|w^0+w^1\|_2&\ge\|Aw\|_2\|A(w^0+w^1)\|_2\ge(Aw)^TA(w^0+w^1)\\&=(w^0+w^1)^TA^TA(w^0+w^1)+\sum_{j=2}^q(w^0+w^1)^TA^TAw^j\\&\ge(1-\delta)\|w^0+w^1\|_2^2-\sqrt2\delta\|w^0+w^1\|_2\sum_{j=2}^q\|w^j\|_2,\end{aligned}\qquad(5.42)$$

where we have used the "classical" RI-based relation (see [6])

$$v^TA^TAu\le\sqrt2\delta\|v\|_2\|u\|_2$$

for any two vectors $u,v\in\mathbb{R}^n$ with disjoint supports and such that $u$ is $s$-sparse and $v$ is $2s$-sparse. Using (5.41) we can now continue (5.42) to get

$$\begin{aligned}(1-\delta)\|w^0+w^1\|_2^2&\le\alpha\sqrt{1+\delta}\,\|w^0+w^1\|_2+\sqrt2\delta\|w^0+w^1\|_2\,s^{-1/2}\sum_{j=1}^{q-1}\|w^j\|_1\\&\le\alpha\sqrt{1+\delta}\,\|w^0+w^1\|_2+\sqrt2s^{-1/2}\delta\|w^0+w^1\|_2\|w-w^0\|_1.\end{aligned}$$

Since $w^0$ is $s$-sparse, we conclude that

$$\|w^0\|_1\le\sqrt s\|w^0\|_2\le\sqrt s\|w^0+w^1\|_2\le\frac{\alpha\sqrt{(1+\delta)s}+\sqrt2\delta(1-\|w^0\|_1)}{1-\delta}$$

(recall that $\|w\|_1=1$). It follows that, with $\rho=\sqrt2\delta(1-\delta)^{-1}$,

$$\|w\|_{s,1}=\|w^0\|_1\le\frac{\alpha\sqrt{(1+\delta)s}}{(1+\rho)(1-\delta)}+\frac{\rho}{1+\rho}=\frac{\alpha\sqrt{(1+\delta)s}}{1+(\sqrt2-1)\delta}+\frac{\sqrt2\delta}{1+(\sqrt2-1)\delta}.$$

Recalling that $\alpha=\|Aw\|_2$, the concluding inequality is exactly (5.40) in the case of $\|w\|_1=1$. (5.40) is proved.

Invoking (2.15), (5.40) implies that with $\|\cdot\|=\|\cdot\|_2$ and with $\beta\ge\frac{\sqrt{(1+\delta)s}}{1+(\sqrt2-1)\delta}$ one has $\hat\gamma_s(A,\beta)\le\frac{\sqrt2\delta}{1+(\sqrt2-1)\delta}$.

It is worth noting that when using the bounds of Proposition 6 on $\hat\gamma_s(A,\beta)$ and the corresponding $\beta$ along with Theorem 3, we recover the classical bounds on the accuracy of the $\ell_1$-recovery as those given in [9, 10].

5.3 Performance of verifiable conditions for s-goodness in the case of RI sensing matrices

It makes sense to ask how conservative the verifiable sufficient condition for $s$-goodness "$\alpha_s(A)<1/2$" is as compared to the difficult-to-verify RI condition "if $A$ is RI($\delta,m$), then $A$ is $s$-good for $s\le O(1)m$". It turns out that this conservatism is under certain control, fully compatible with the "limits of performance" of our verifiable condition as stated in Proposition 4. Specifically, we are about to prove that if $A$ is RI($\delta,m$), then $\alpha_s(A)<1/2$ when $s\le O(1)\sqrt m$: our verifiable condition "guarantees at least the square root of what actually takes place". The precise statement is as follows:
Proposition 7 Let a $k\times n$ matrix $A$ possess the RI($\delta,m$)-property. Then

$$\alpha_1(A)\le\frac{\sqrt2\delta}{(1-\delta)\sqrt{m-1}},\qquad(5.43)$$

so that

$$s<\frac{(1-\delta)\sqrt{m-1}}{2\sqrt2\delta}\ \Rightarrow\ \alpha_s(A)\le\frac{\sqrt2s\delta}{(1-\delta)\sqrt{m-1}}<\frac12.\qquad(5.44)$$

Proof. 1°. We start with the following simple fact (cf. Proposition 5):

Lemma 1 Let $A$ possess the RI($\delta,m$)-property. Then

$$\hat\gamma_1(A)\le\frac{\sqrt2\delta}{(1-\delta)\sqrt{m-1}}.\qquad(5.45)$$

Proof. Invoking Theorem 2, all we need to prove is that under the premise of the Lemma for every $s$, $1\le s\le m$, and for every $w\in\mathrm{Ker}(A)$ we have

$$\|w\|_\infty=\|w\|_{1,1}\le\hat\gamma\|w\|_1,$$

where $\hat\gamma$ is the right hand side in (5.45). To prove this fact we use again the standard machinery related to the RI-property (cf. the proof of Proposition 6): we set $t=\lfloor m/2\rfloor$, assume w.l.o.g. that $\|w\|_1=1$, $|w_1|\ge|w_2|\ge\dots\ge|w_n|$ and split $w$ into $q$ consecutive "blocks" so that the cardinalities of the "blocks" are $1,t-1,t,t,\dots$. That is, the first "block" $w^0\in\mathbb{R}^n$ is the vector such that $w^0_1=w_1$ and all other coordinates vanish, $w^1$ is obtained from $w$ by zeroing all coordinates except for those with indices $2,3,\dots,t$, $w^2$ is obtained from $w$ by zeroing all coordinates except for those with indices $t+1,\dots,2t$, and so on, with evident modification for the last vector $w^q$. Acting as in the proof of Proposition 6, and using the relation (see [6])

$$v^TA^TAu\le\delta\|v\|_2\|u\|_2$$

for any two $t$-sparse vectors $u,v\in\mathbb{R}^n$, $t\le m/2$, with disjoint supports, we obtain

$$0=(A(w^0+w^1))^TAw\ge(1-\delta)\|w^0+w^1\|_2^2-t^{-1/2}\delta\|w^0+w^1\|_2,$$

whence, since $t=\lfloor m/2\rfloor\ge(m-1)/2$,

$$|w_1|\le\|w^0+w^1\|_2\le\frac{\delta}{(1-\delta)\sqrt t}\le\frac{\sqrt2\delta}{(1-\delta)\sqrt{m-1}},$$

which is (5.45).

2°. Now we are ready to complete the proof of Proposition 7. We already know that $\alpha_s(A)\le s\alpha_1(A)$, so that (5.44) follows from (5.43), and all we need is to verify (5.43). The latter is readily given by (4.30) combined with (5.45).
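The "square root" guarantee can be quantified with the intermediate bound obtained in the proof of Lemma 1, namely $\alpha_1(A)\le\delta/((1-\delta)\sqrt t)$ with $t=\lfloor m/2\rfloor$. The following sketch (function names are illustrative, inputs are made up) returns the largest $s$ for which $s$ times this bound stays below $1/2$, which by $\alpha_s(A)\le s\alpha_1(A)$ certifies $s$-goodness.

```python
import math

def alpha1_bound_from_ri(delta, m):
    """Bound delta / ((1 - delta) * sqrt(t)), t = floor(m/2), from the proof of Lemma 1."""
    t = m // 2
    if not (0.0 < delta < 1.0) or t < 1:
        raise ValueError("need 0 < delta < 1 and m >= 2")
    return delta / ((1.0 - delta) * math.sqrt(t))

def max_s_certified(delta, m):
    """Largest integer s with s * alpha1_bound < 1/2 (0 if there is none)."""
    bound = alpha1_bound_from_ri(delta, m)
    return max(math.ceil(0.5 / bound) - 1, 0)

# Hypothetical RI parameters, for illustration only.
delta, m = 0.3, 33
s_cert = max_s_certified(delta, m)
print(alpha1_bound_from_ri(delta, m), s_cert)
```

Doubling $s$ requires roughly quadrupling $m$, in agreement with the $O(\sqrt m)$ scaling stated before Proposition 7.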



6 Numerical illustration 

We are about to present some very preliminary numerical results for relatively small sensing matrices.

The data. In the two series of experiments presented below we deal with sensing matrices of row dimension $n=256$ and $n=1024$.

For $n=256$ we generate three sets of random matrices of column dimension $m=0.1n,\dots,0.9n$: Gaussian matrices, with i.i.d. normal entries; Fourier matrices, which are $m$ rows of the Fourier basis on $[0,1]$ drawn at random; and, finally, Hadamard matrices, which are, again, random $m\times n$ cuts from the $n\times n$ Hadamard matrix.† Then all matrices are normalized so that their columns have unit $\ell_2$-norm.

For $n=1024$ we provide the result of an experiment with a family of Gaussian matrices of column dimension $m=0.1n,\dots,0.9n$ and with a $992\times1024$ matrix $A_{\mathrm{conv}}$ which is constructed as follows. Let us consider a signal $x$ "living" on $\mathbb{Z}^2$ and supported on the $32\times32$ grid $\Gamma=\{(i,j)\in\mathbb{Z}^2:\ 0\le i,j\le31\}$. We subject such a signal to discrete time convolution with a kernel supported on the set $\{(i,j)\in\mathbb{Z}^2:\ -7\le i,j\le7\}$, and then restrict the result on the $32\times31$ grid $\Gamma_+=\{(i,j)\in\Gamma:\ 1\le j\le31\}$. This way we obtain a linear mapping $x\mapsto A_{\mathrm{conv}}x:\ \mathbb{R}^{1024}\to\mathbb{R}^{992}$.

The goal of the experiment is to bound from below and from above the maximal $s$ for which the $m\times n$ matrix $A$ in question is $s$-good (the quantity $s_*(A)$ from Definition 1).

The lower bound on $s_*(A)$ was obtained via bounding from above, for various $s$, the quantity $\alpha_s(A)$ and invoking Theorem 4 and Theorem 1 (ii) which, taken together, state that a sufficient condition for $A$ to be $s$-good is $\alpha_s(A)<1/2$.

We provide two lower bounds for $s_*(A)$. The first bound is obtained using the upper bound $\alpha_s(A)\le s\alpha_1(A)$ (see Comment B in Section 4). When the upper bound $s\alpha_1(A)$ for $\alpha_s(A)$ is computed and turns out to be $<1/2$, we know that $A$ is $s$-good, and our lower bound on $s_*(A)$ is the largest $s$ for which this situation takes place; note that computing this bound reduces to a single computation of $\alpha_1(A)$. As explained in Comment B, this computation reduces to solving $n$ convex programs of design dimension $m$ each, and these programs are easily convertible to LPs with $(2n+1)\times(m+1)$ constraint matrices. These LPs were solved using the commercial LP solver mosekopt [1]. Note that in fact computing $\alpha_1(A)$ allows to somehow improve the trivial upper bound $s\alpha_1(A)$ on $\alpha_s(A)$, specifically, as follows. As a result of computing $\alpha_1(A)$, we get the associated matrix $Y$; the largest of the $\|\cdot\|_{s,1}$-norms of the columns of $I-Y^TA$ clearly is an upper bound on $\alpha_s(A)$, and this bound is at worst $s\alpha_1(A)$.

For "small" matrices with the row dimension $n=256$ we also provide the "improved" lower bound, obtained using the computation of $\alpha_s(A)$ itself. We act as follows: when the bound $s(\alpha_1)$ is computed, we verify whether the value $s(\alpha_1)+1$ can be certified as a lower bound for $s_*(A)$ using the computation of $\alpha_s(A)$. If this bound is certified, we proceed with $s(\alpha_1)+2$, and so on. Note that, exactly as in the case of $\alpha_1(A)$, computing $\alpha_s(A)$ allows to improve the lower bound on $s_*(A)$ in the case when $\alpha_s(A)<1/2$. Indeed, as a result of computing $\alpha_s(A)$, we get the associated matrix $Y$; the largest $s$ such that the $\|\cdot\|_{s,1}$-norms of the columns of $I-Y^TA$ are $<1/2$ clearly is a lower bound on $s_*(A)$.
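The post-processing step described above, turning a matrix $Y$ into a certified value of $s$, is straightforward once $Y$ is available. The sketch below is illustrative only: it does not compute $Y$ (that requires the LP or saddle-point machinery), it merely takes some $k\times n$ matrix `Y` (stored like `A`, as a list of rows) and returns the largest $s$ for which every column of $I-Y^TA$ has $\|\cdot\|_{s,1}$-norm below $1/2$. The function names and the toy matrices are hypothetical.

```python
def norm_s1(v, s):
    """Sum of the s largest magnitudes of the entries of v."""
    return sum(sorted((abs(x) for x in v), reverse=True)[:s])

def residual_columns(A, Y):
    """Columns of Q = I - Y^T A for k x n matrices A and Y given as lists of rows."""
    k, n = len(A), len(A[0])
    cols = []
    for j in range(n):
        col = []
        for i in range(n):
            yta = sum(Y[r][i] * A[r][j] for r in range(k))  # (Y^T A)_{ij}
            col.append((1.0 if i == j else 0.0) - yta)
        cols.append(col)
    return cols

def largest_certified_s(A, Y):
    """Largest s such that max_j ||[I - Y^T A]_j||_{s,1} < 1/2 (0 if none)."""
    cols = residual_columns(A, Y)
    best = 0
    for s in range(1, len(cols) + 1):
        if max(norm_s1(c, s) for c in cols) < 0.5:
            best = s
        else:
            break  # the norms are nondecreasing in s
    return best

# Toy example: unit-norm columns and Y = A / (1 + mu) with mu = 0.8.
A = [[1.0, 0.0, 0.6],
     [0.0, 1.0, 0.8]]
Y = [[v / 1.8 for v in row] for row in A]
s_lower = largest_certified_s(A, Y)
print(s_lower)
```

Any $Y$ gives a valid certificate; a better $Y$ (for example, one returned by the optimization) can only increase the certified $s$.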

We would like to add here two words about the techniques used to compute the corresponding bound (being of interest by themselves, these techniques are the subject of a separate paper). While $\alpha_s(A)$ is efficiently computable via LP (when $\beta=\infty$, the optimization program in (4.25) is easily convertible into a linear programming one), the sizes of the resulting LP are rather large: when $A$ is $m\times n$, the LP reformulation of (4.25) has a $(2n^2+n)\times(n(m+n+1)+1)$ constraint matrix. For instance, for $m=230$ and $n=256$, the size of the LP becomes $131{,}328\times127{,}233$, and we preferred to avoid solving this, by no means small, LP program using the interior-point solver available with mosekopt. Instead, the LP is reformulated as a saddle-point problem and is solved using an implementation of the non-Euclidean mirror-prox algorithm described in [20].

The upper bound on $s_*(A)$ is computed using the lower bound on $\hat\gamma_s(A)$ by the Sequential Convex Approximation algorithm presented in Section 4.1.

The results of our experiments are presented in Tables 1 and 2. The computations were run on an Intel P9500@2.53GHz CPU (single-core). We present along with the results the corresponding CPU usage.

† The Hadamard matrix $H_\ell$ of order $n=2^\ell$ is the orthogonal matrix with entries $\pm1$ given by the recurrence $H_0=1$, $H_{\ell+1}=[H_\ell,H_\ell;\ H_\ell,-H_\ell]$.
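For concreteness, the Hadamard recurrence and the "random cut" construction of the test matrices can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the code used in the experiments: the function names, the seed, and the way rows are sampled (uniformly without replacement) are our choices.

```python
import math
import random

def hadamard(ell):
    """Hadamard matrix of order 2**ell via H_0 = [1], H_{l+1} = [[H, H], [H, -H]]."""
    H = [[1]]
    for _ in range(ell):
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def random_row_cut(H, m, rng):
    """Pick m distinct rows of H at random and scale columns to unit l2 norm."""
    rows = sorted(rng.sample(range(len(H)), m))
    A = [[float(v) for v in H[r]] for r in rows]
    for j in range(len(H[0])):
        norm = math.sqrt(sum(A[i][j] ** 2 for i in range(m)))
        for i in range(m):
            A[i][j] /= norm
    return A

H = hadamard(3)                          # 8 x 8, entries +-1
A = random_row_cut(H, 3, random.Random(0))
print(len(A), len(A[0]))
```

Since all entries of $H_\ell$ are $\pm1$, every column of an $m$-row cut has Euclidean norm $\sqrt m$ before normalization, so the scaling step amounts to dividing by $\sqrt m$.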

We would like to add the following comment: our efficiently computable lower bounds on $s_*(A)$ significantly outperform those based on mutual incoherence. Further, these lower and upper bounds "somehow" work in the case of the randomly chosen sensing matrix and work quite well in the case of the convolution matrix. While the gap between the lower and the upper bound in the case of the random sensing matrix could be better, we can reiterate at this point our remark that computability has its price.
A Proof of Theorem 1

Proof. (i): a) Assume that $A$ is $s$-good, and let us prove that $\gamma_s(A)<1$. Let $I$ be an $s$-element subset of the index set $\{1,\dots,n\}$ and $\bar I$ be its complement, and let $w$ be a vector supported on $I$ and with nonzero $w_i$, $i\in I$. Then $w$ should be the unique solution to the LP problem (2.9). From the fact that $w$ is an optimal solution to this problem it follows, by optimality conditions, that for certain $y$ the function $f_y(x)=\|x\|_1-y^TAx$ attains its minimum over $x\in\mathbb{R}^n$ at $x=w$, meaning that $0\in\partial f_y(w)$, that is,

$$(A^Ty)_i\ \begin{cases}=\mathrm{sign}(w_i),&i\in I\\ \in[-1,1],&i\in\bar I,\end{cases}$$

so that the LP problem

$$\min_{y,\gamma}\left\{\gamma:\ (A^Ty)_i=\mathrm{sign}(w_i),\ i\in I;\ |(A^Ty)_i|\le\gamma,\ i\in\bar I\right\}\qquad(A.46)$$

has optimal value $\le1$. Let us prove that in fact the optimal value is $<1$. Indeed, assuming that the optimal value is exactly 1, there should exist Lagrange multipliers $\{\mu_i:\ i\in I\}$ and $\{\nu_i^\pm\ge0:\ i\in\bar I\}$ such that the function

$$\gamma+\sum_{i\in\bar I}\left[\nu_i^+[(A^Ty)_i-\gamma]+\nu_i^-[-(A^Ty)_i-\gamma]\right]-\sum_{i\in I}\mu_i[(A^Ty)_i-\mathrm{sign}(w_i)]$$

has unconstrained minimum in $\gamma,y$ equal to 1, meaning that

(a) $\sum_{i\in\bar I}[\nu_i^++\nu_i^-]=1$,

(b) $\sum_{i\in I}\mu_i\,\mathrm{sign}(w_i)=1$,

(c) $Ad=0$, where $d\in\mathbb{R}^n$ with $d_i=-\mu_i$ for $i\in I$ and $d_i=\nu_i^+-\nu_i^-$ for $i\in\bar I$.

Now consider the vector $x_t=w+td$, where $t>0$. This is a feasible solution to (2.9) due to (c); the $\|\cdot\|_1$-norm of this solution is

$$\sum_{i\in I}|w_i-t\mu_i|+t\sum_{i\in\bar I}|\nu_i^+-\nu_i^-|\le\sum_{i\in I}|w_i-t\mu_i|+t,$$

where the concluding inequality is given by (a) and the fact that $\nu_i^\pm\ge0$. Since $w_i\ne0$ for $i\in I$, for small positive $t$ we have

$$\sum_{i\in I}|w_i-t\mu_i|=\sum_{i\in I}|w_i|-t\sum_{i\in I}\mu_i\,\mathrm{sign}(w_i)=\sum_{i\in I}|w_i|-t,$$

where the concluding equality is given by (b). We see that $x_t$ is feasible for (2.9) and $\|x_t\|_1\le\|w\|_1$ for all small positive $t$. Since $w$ is the unique optimal solution to (2.9), we should have $x_t=w$, $t>0$, which would imply that $\mu_i=0$ for all $i$; but the latter is impossible by (b). Thus, the optimal value in (A.46) is $<1$.

We see that whenever $x$ is a vector with $s$ nonzero entries, equal to $\pm1$, there exists $y$ such that $(A^Ty)_i=x_i$ when $x_i\ne0$ and $|(A^Ty)_i|<1$ when $x_i=0$ (indeed, in the role of this vector one can take the $y$-component of an optimal solution to the problem (A.46) coming from $w=x$), meaning that $\gamma_s(A)<1$, as claimed.

b) Now assume that $\gamma_s(A)<1$, and let us prove that $A$ is $s$-good. Thus, let $w$ be an $s$-sparse vector; we should prove that $w$ is the unique optimal solution to (2.9). There is nothing to prove when $w=0$. Now let $w\ne0$, let $s'$ be the number of nonzero entries of $w$, and $I$ be the set of indices of these entries. By C we have $\gamma:=\gamma_{s'}(A)\le\gamma_s(A)$, i.e., $\gamma<1$. Recalling the definition of $\gamma_s(\cdot)$, there exists $y\in\mathbb{R}^k$ such that $(A^Ty)_i=\mathrm{sign}(w_i)$ when $w_i\ne0$ and $|(A^Ty)_i|\le\gamma$ when $w_i=0$. The function

$$f(x)=\|x\|_1-y^T[Ax-Aw]=\sum_{i\in I}\left[|x_i|-\mathrm{sign}(w_i)(x_i-w_i)\right]+\sum_{i\notin I}\left[|x_i|-\gamma_ix_i\right],\qquad\gamma_i=(A^Ty)_i,\ i\notin I,$$

coincides with the objective of (2.9) on the feasible set of (2.9). Since $|\gamma_i|\le\gamma<1$, this function attains its unconstrained minimum in $x$ at $x=w$. Combining these two observations, we see that $x=w$ is an optimal solution to (2.9). To see that this optimal solution is unique, let $x'$ be another optimal solution to the problem. Then

$$0=f(x')-f(w)=\sum_{i\in I}\left[|x'_i|-\mathrm{sign}(w_i)(x'_i-w_i)-|w_i|\right]+\sum_{i\notin I}\left[|x'_i|-\gamma_ix'_i\right];$$

since $|\gamma_i|<1$ for $i\notin I$, we conclude that $x'_i=0$ for $i\notin I$. This conclusion combines with the relation $Ax'=Aw$ to imply the required relation $x'=w$, due to the following immediate observation:

Lemma 2 If $\gamma_s(A)<1$, then every $k\times s$ submatrix of $A$ has trivial kernel.

Proof. Let $I$ be the set of column indices of a $k\times s$ submatrix of $A$. If this submatrix has a nontrivial kernel, there exists a nonzero $s$-sparse vector $z\in\mathbb{R}^n$ such that $Az=0$. Let $I$ be the support set of $z$. By A, there exists a vector $y\in\mathbb{R}^k$ such that $(A^Ty)_i=\mathrm{sign}(z_i)$ whenever $i\in I$, that is

$$0=y^TAz=\sum_{i\in I}(A^Ty)_iz_i=\|z\|_1,$$

which is impossible.

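(ii) Let $\gamma:=\gamma_s(A,\beta)<1$. By definition it means that for every vector $z\in\mathbb{R}^n$ with $s$ nonzero entries, equal to $\pm1$, there exists $y$, $\|y\|_*\le\beta$, such that $A^Ty$ coincides with $z$ on the support of $z$ and is such that $\|A^Ty-z\|_\infty\le\gamma$. Given $z,y$ as above and setting $y'=\frac{1}{1+\gamma}y$, we get $\|y'\|_*\le\frac{\beta}{1+\gamma}$ and

$$\|A^Ty'-z\|_\infty\le\max\left[1-\frac{1}{1+\gamma},\ \frac{\gamma}{1+\gamma}\right]=\frac{\gamma}{1+\gamma}.$$

Thus, for every vector $z$ with $s$ nonzero entries, equal to $\pm1$, there exists $y'$ such that $\|y'\|_*\le\frac{\beta}{1+\gamma}$ and $\|A^Ty'-z\|_\infty\le\frac{\gamma}{1+\gamma}$, meaning that $\gamma:=\gamma_s(A,\beta)<1$ implies

$$\hat\gamma_s\left(A,\frac{\beta}{1+\gamma}\right)\le\frac{\gamma}{1+\gamma}<\frac12.\qquad(A.47)$$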
V 1+7 J 1+7 
Now assume that $\hat\gamma:=\hat\gamma_s(A,\beta)<1/2$. For an $s$-element subset $I$ of the index set $\{1,\dots,n\}$, let

$$\Pi_I=\left\{u\in\mathbb{R}^n:\ \exists y\in\mathbb{R}^k:\ \|y\|_*\le\beta,\ (A^Ty)_i=u_i\text{ for }i\in I,\ |(A^Ty)_i|\le\hat\gamma\text{ for }i\in\bar I\right\},$$

where $\bar I$ is the complement of $I$. It is immediately seen that $\Pi_I$ is a closed and convex set in $\mathbb{R}^n$. Let $B$ be the centered at the origin $\|\cdot\|_\infty$-ball of the radius $1-\hat\gamma$ in $\mathbb{R}^n$: $B=\{u\in\mathbb{R}^n:\ \|u\|_\infty\le1-\hat\gamma\}$. We claim that $\Pi_I$ contains $B$. Using this fact we conclude that for every vector $z$ supported on $I$ with entries $z_i$, $i\in I$, equal to $\pm1$, there exists $u\in\Pi_I$ such that $u_i=(1-\hat\gamma)z_i$, $i\in I$. Recalling the definition of $\Pi_I$, we conclude that there exists $y$ with $\|y\|_*\le(1-\hat\gamma)^{-1}\beta$ such that $(A^Ty)_i=(1-\hat\gamma)^{-1}u_i=z_i$ for $i\in I$ and $|(A^Ty)_i|\le(1-\hat\gamma)^{-1}\hat\gamma$ for $i\notin I$. Thus, the validity of our claim would imply that

$$\hat\gamma:=\hat\gamma_s(A,\beta)<1/2\ \Rightarrow\ \gamma_s\left(A,\frac{\beta}{1-\hat\gamma}\right)\le\frac{\hat\gamma}{1-\hat\gamma}<1.\qquad(A.48)$$

Let us prove our claim. Observe that by definition $\Pi_I$ is the direct product of its projection $Q$ on the plane $L_I=\{u\in\mathbb{R}^n:\ u_i=0,\ i\notin I\}$ and the entire orthogonal complement $L_I^\perp=\{u\in\mathbb{R}^n:\ u_i=0,\ i\in I\}$ of $L_I$; since $\Pi_I$ is closed and convex, so is $Q$. Now, $L_I$ can be naturally identified with $\mathbb{R}^s$, and our claim is exactly the statement that the image $\bar Q\subset\mathbb{R}^s$ of $Q$ under this identification contains the centered at the origin $\|\cdot\|_\infty$-ball $B_s$, of the radius $1-\hat\gamma$, in $\mathbb{R}^s$. Assume that it is not the case. Since $\bar Q$ is convex and $B_s\not\subset\bar Q$, there exists $v\in B_s\setminus\bar Q$, and therefore there exists a vector $e\in\mathbb{R}^s$, $\|e\|_1=1$, such that $e^Tv>\max_{v'\in\bar Q}e^Tv'$ (recall that $Q$, and thus $\bar Q$, is both convex and closed). Now let $z\in\mathbb{R}^n$ be the $s$-sparse vector supported on $I$ such that the entries of $z$ with indices $i\in I$ are the signs of the corresponding entries in $e$. By definition of $\hat\gamma=\hat\gamma_s(A,\beta)$, there exists $y\in\mathbb{R}^k$ such that $\|y\|_*\le\beta$ and $\|A^Ty-z\|_\infty\le\hat\gamma$; recalling the definition of $\Pi_I$ and $Q$, this means that $\bar Q$ contains a vector $\bar v$ with $|\bar v_j-\mathrm{sign}(e_j)|\le\hat\gamma$, $1\le j\le s$, whence $e^T\bar v\ge\|e\|_1-\hat\gamma\|e\|_1=1-\hat\gamma$. We now have

$$1-\hat\gamma\ge\|v\|_\infty\ge e^Tv>e^T\bar v\ge1-\hat\gamma,$$

where the first $\ge$ is due to $v\in B_s$, and $>$ is due to the origin of $e$. The resulting inequality is impossible, and thus our claim is true.

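We have proved the relations (A.47), (A.48) which are slightly weakened versions of (2.14.a-b). It remains to prove that the inequalities $\le$ in the conclusions of (A.47), (A.48) are in fact equalities. This is immediate: assume that under the premise of (2.14.a) we have

$$\hat\gamma:=\hat\gamma_s\left(A,\frac{1}{1+\gamma}\beta\right)<\hat\gamma_+:=\frac{\gamma}{1+\gamma}.$$

When applying (A.48) with $\beta$ replaced with $\frac{1}{1+\gamma}\beta$, we get

$$\gamma_s\left(A,\frac{1}{1-\hat\gamma}\cdot\frac{\beta}{1+\gamma}\right)\le\frac{\hat\gamma}{1-\hat\gamma}<\frac{\hat\gamma_+}{1-\hat\gamma_+}=\gamma.\qquad(A.49)$$

At the same time, $\frac{1}{(1+\gamma)(1-\hat\gamma)}<\frac{1}{(1+\gamma)(1-\hat\gamma_+)}=1$ due to $\hat\gamma<\hat\gamma_+$; since $\gamma_s(A,\cdot)$ is nonincreasing by B, we see that

$$\gamma_s\left(A,\frac{1}{1-\hat\gamma}\cdot\frac{\beta}{1+\gamma}\right)\ge\gamma_s(A,\beta),$$

and thus (A.49) implies that $\gamma_s(A,\beta)<\gamma$, which contradicts the definition of $\gamma$. Thus, the concluding $\le$ in (A.47) is in fact equality. By a completely similar argument, so is the concluding $\le$ in (A.48).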
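Read as equalities, (A.47) and (A.48) are mutually inverse changes of variables between the pairs $(\gamma,\beta)$ and $(\hat\gamma,\hat\beta)$. A minimal sketch of this bookkeeping (function names are illustrative), including a round-trip check:

```python
def hat_from_gamma(gamma, beta):
    """(A.47) as an equality: gamma_s(A, beta) = gamma < 1 gives
    gamma_hat_s(A, beta / (1 + gamma)) = gamma / (1 + gamma)."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("need 0 <= gamma < 1")
    return gamma / (1.0 + gamma), beta / (1.0 + gamma)

def gamma_from_hat(gamma_hat, beta):
    """(A.48) as an equality: gamma_hat_s(A, beta) = gamma_hat < 1/2 gives
    gamma_s(A, beta / (1 - gamma_hat)) = gamma_hat / (1 - gamma_hat)."""
    if not 0.0 <= gamma_hat < 0.5:
        raise ValueError("need 0 <= gamma_hat < 1/2")
    return gamma_hat / (1.0 - gamma_hat), beta / (1.0 - gamma_hat)

g_hat, b_hat = hat_from_gamma(0.6, 2.0)      # (0.375, 1.25)
g_back, b_back = gamma_from_hat(g_hat, b_hat)
print(g_hat, b_hat, g_back, b_back)
```

The map $\gamma\mapsto\gamma/(1+\gamma)$ sends $[0,1)$ onto $[0,1/2)$, which is why the conditions $\gamma_s<1$ and $\hat\gamma_s<1/2$ are interchangeable.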
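Gaussian matrix

| $m$ | lower bound $s[\mu]$ | lower bound $s[\alpha_1]$ | lower bound $s[\alpha_s]$ | upper bound $\bar s$ | CPU time (s), $s[\alpha_1]$ | CPU time (s), $s[\alpha_s]$ | CPU time (s), $\bar s$ |
|---|---|---|---|---|---|---|---|
| 25 | 1 | 1 | 1 | 1 | 11.0 | 21.6 | 3.4 |
| 51 | 1 | 2 | 2 | 4 | 22.3 | 24.1 | 8.8 |
| 76 | 1 | 3 | 3 | 7 | 34.2 | 34.3 | 23.1 |
| 102 | 1 | 3 | 4 | 11 | 50.8 | 190.7 | 34.0 |
| 128 | 1 | 5 | 5 | 15 | 69.3 | 75.8 | 31.6 |
| 153 | 1 | 5 | 6 | 19 | 93.8 | 557.6 | 60.7 |
| 179 | 2 | 7 | 8 | 25 | 115.4 | 658.3 | 103.8 |
| 204 | 2 | 9 | 11 | 31 | 141.2 | 551.5 | 97.8 |
| 230 | 2 | 14 | 17 | 41 | 173.0 | 561.0 | 97.8 |

Random Fourier matrix

| $m$ | lower bound $s[\mu]$ | lower bound $s[\alpha_1]$ | lower bound $s[\alpha_s]$ | upper bound $\bar s$ | CPU time (s), $s[\alpha_1]$ | CPU time (s), $s[\alpha_s]$ | CPU time (s), $\bar s$ |
|---|---|---|---|---|---|---|---|
| 24 | 1 | 1 | 1 | 2 | 9.3 | 6.1 | 1.3 |
| 51 | 1 | 2 | 2 | 4 | 129.5 | 14.5 | 7.2 |
| 76 | 2 | 3 | 3 | 6 | 233.1 | 12.8 | 16.1 |
| 102 | 2 | 4 | 4 | 7 | 213.9 | 11.2 | 25.6 |
| 128 | 2 | 4 | 4 | 8 | 270.9 | 426.5 | 58.1 |
| 152 | 3 | 5 | 5 | 10 | 245.9 | 2350.7 | 57.8 |
| 178 | 3 | 6 | 6 | 14 | 319.7 | 161.2 | 81.5 |
| 204 | 4 | 7 | 7 | 14 | 234.0 | 97.9 | 75.8 |
| 230 | 4 | 9 | 9 | 19 | 343.2 | 76.0 | 51.9 |

Random Hadamard matrix

| $m$ | lower bound $s[\mu]$ | lower bound $s[\alpha_1]$ | lower bound $s[\alpha_s]$ | upper bound $\bar s$ | CPU time (s), $s[\alpha_1]$ | CPU time (s), $s[\alpha_s]$ | CPU time (s), $\bar s$ |
|---|---|---|---|---|---|---|---|
| 25 | 1 | 1 | 1 | 2 | 10.1 | 7.4 | 1.2 |
| 51 | 1 | 2 | 2 | 4 | 21.6 | 11.7 | 3.5 |
| 76 | 2 | 3 | 3 | 4 | 34.1 | 14.2 | 6.7 |
| 102 | 3 | 4 | 4 | 11 | 50.8 | 23.8 | 37.7 |
| 128 | 3 | 5 | 5 | 7 | 69.6 | 48.5 | 24.1 |
| 153 | 3 | 7 | 7 | 11 | 93.8 | 31.1 | 84.7 |
| 179 | 4 | 9 | 9 | 15 | 112.0 | 51.0 | 88.9 |
| 204 | 5 | 12 | 12 | 15 | 141.6 | 51.1 | 78.6 |
| 230 | 6 | 18 | 18 | 28 | 141.5 | 55.4 | 44.1 |

Table 1: Efficiently computable bounds on $s_*(A)$, $n=256$.
Lower bound $s[\mu]$: the bound (4.32) based on mutual incoherence; $s[\alpha_1]$-bound: the "improved" bound based on upper bounding of $\alpha_s(A)$ via the matrix $Y$ obtained when computing $\alpha_1(A)$; $s[\alpha_s]$: the bound based on computing $\alpha_s(A)$. Upper bound $\bar s$: the bound based on successive convex approximation.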
Gaussian matrix

| $m$ | lower bound $s[\mu]$ | lower bound $s[\alpha_1]$ | upper bound $\bar s$ | CPU time (s), $s[\alpha_1]$ | CPU time (s), $\bar s$ |
|---|---|---|---|---|---|
| 102 | 2 | 2 | 8 | 457.0 | 400.7 |
| 204 | 2 | 4 | 18 | 1179.0 | 1722.1 |
| 307 | 2 | 6 | 30 | 2234.6 | 7585.9 |
| 409 | 3 | 7 | 44 | 3658.6 | 3421.7 |
| 512 | 3 | 10 | 61 | 5341.7 | 6304.3 |
| 614 | 3 | 12 | 78 | 7155.7 | 17616.7 |
| 716 | 3 | 15 | 105 | 9446.1 | 11670.4 |
| 819 | 4 | 21 | 135 | 12435.1 | 8373.1 |
| 921 | 4 | 32 | 161 | 13564.2 | 9838.3 |

Convolution matrix

| $m$ | lower bound $s[\mu]$ | lower bound $s[\alpha_1]$ | upper bound $\bar s$ | CPU time (s), $s[\alpha_1]$ | CPU time (s), $\bar s$ |
|---|---|---|---|---|---|
| 960 |  | 5 | 7 | 4579.1 | 271.8 |

Table 2: Efficiently computable bounds on $s_*(A)$, $n=1024$.
Lower bound $s[\mu]$: the bound (4.32) based on mutual incoherence; $s[\alpha_1]$-bound: the "improved" bound based on upper bounding of $\alpha_s(A)$ via the matrix $Y$ obtained when computing $\alpha_1(A)$. Upper bound $\bar s$: the bound based on successive convex approximation.